public inbox for pve-devel@lists.proxmox.com
* [pve-devel] [RFC qemu/common/storage/qemu-server/container/manager v3 00/34] backup provider API
@ 2024-11-07 16:51 Fiona Ebner
  2024-11-07 16:51 ` [pve-devel] [PATCH qemu v3 01/34] block/reqlist: allow adding overlapping requests Fiona Ebner
                   ` (34 more replies)
  0 siblings, 35 replies; 63+ messages in thread
From: Fiona Ebner @ 2024-11-07 16:51 UTC (permalink / raw)
  To: pve-devel

Changes in v3:
* Add storage_has_feature() helper and use it to decide on whether the
  storage uses a backup provider, instead of having this be implicit
  with whether a backup provider is returned by new_backup_provider().
* Fix querying block-node size for fleecing in stop mode, by issuing
  the QMP command only after the VM is enforced running.
* Run backup_container() in the user namespace associated to the
  container.
* Introduce a 'prepare' phase for backup_hook(), used to prepare for
  running in that user namespace context.
* Pass in guest and firewall config as raw data instead of by file
  name (so files don't have to be accessible in user namespace context
  for containers).
* Run restore of containers with the 'directory' mechanism in the
  user namespace, switching from 'rsync' to 'tar', which is easier to
  "split" into a privileged and an unprivileged half.
* Check potentially untrusted tar archives.
* Borg plugin: make SSH work and use that.

Changes in v2:
* Add 'block-device' backup mechanism for VMs. The NBD export is
  mounted by Proxmox VE and only the block device path (as well as a
  callback to get the next dirty range for bitmaps) is passed to the
  backup provider.
* Add POC example for Borg - note that I tested with borg 1.2.4 in
  Debian and only tested with a local repository, not SSH yet.
* Merge hook API into a single function for backup and for jobs.
* Add restore_vm_init() and restore_vm_cleanup() for better
  flexibility, to allow preparing the whole restore. The question is
  whether restore_vm_volume_init() and restore_vm_volume_cleanup()
  should be dropped (but certain providers might prefer using only
  those)? Having both is more flexible, but makes the API longer of
  course.
* Switch to backup_vm() (was per-volume backup_vm_volume() before) and
  backup_container(), passing along the configuration files, rather
  than having dedicated methods for the configuration files, to give
  the backup provider more flexibility.
* Some renames in API methods/params to improve clarity.
* Pass backup time to backup 'start' hook and use that in the
  directory example rather than the job start time.
* Use POD for base plugin documentation and flesh out documentation.
* Use 'BackupProvider::Plugin::' namespace.
* Various smaller improvements in the directory provider example.

======

A backup provider needs to implement a storage plugin as well as a
backup provider plugin. The storage plugin is for integration in
Proxmox VE's front-end, so users can manage the backups via
UI/API/CLI. The backup provider plugin is for interfacing with the
backup provider's backend to integrate backup and restore with that
backend into Proxmox VE.
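
For illustration, the storage plugin side could look roughly like the
following sketch. The new_backup_provider() method name is from this
series, but the exact signature and the provider class name are
assumptions here:

    package PVE::Storage::Custom::MyBackupPlugin;

    use strict;
    use warnings;

    use base qw(PVE::Storage::Plugin);

    # Return the backup provider instance for this storage. The
    # argument list is assumed for illustration.
    sub new_backup_provider {
        my ($class, $scfg, $storeid, $log_function) = @_;
        return PVE::BackupProvider::Plugin::MyProvider->new(
            $scfg, $storeid, $log_function);
    }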

This is an initial draft of an API and required changes to the backup
stack in Proxmox VE to make it work. Depending on feedback from other
developers and interested parties, it can still substantially change.

======

The backup provider API is split into two parts, both of which again
need different implementations for VM and LXC guests:

1. Backup API

There are two hook callback functions, namely:
1. job_hook() is called during the start/end/abort phases of the whole
   backup job.
2. backup_hook() is called during the start/end/abort phases of the
   backup of an individual guest. There also is a 'prepare' phase
   useful for container backups, because the backup method for
   containers itself is executed in the user namespace context
   associated to the container.

The backup_get_mechanism() method is used to decide on the backup
mechanism. Currently, 'block-device' or 'nbd' for VMs, and 'directory'
for containers are possible. The method also lets the plugin indicate
whether to use a bitmap for incremental VM backup. It is enough to
implement one mechanism for VMs and one mechanism for containers.

Next, there are methods for backing up the guest's configuration and
data, backup_vm() for VM backup and backup_container() for container
backup, with the latter running in the user namespace context
associated to the container.

Finally, there are some helpers, e.g. for getting the provider name or
the volume ID for the backup target, as well as for handling the
backup log.
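
To make the shape of the backup API more concrete, here is a minimal,
hypothetical provider skeleton. The method names are the ones from
this cover letter, but the signatures are not spelled out here, so
treat them as assumptions:

    package PVE::BackupProvider::Plugin::MyProvider;

    use strict;
    use warnings;

    use base qw(PVE::BackupProvider::Plugin::Base);

    sub backup_get_mechanism {
        my ($self, $vmid, $vmtype) = @_;
        # 'block-device' or 'nbd' for VMs, 'directory' for containers.
        # Assumed here: a second return value names the bitmap to use
        # for incremental VM backup (none for containers).
        return $vmtype eq 'lxc' ? 'directory' : ('block-device', 'my-bitmap');
    }

    sub job_hook {
        my ($self, $phase, $info) = @_;
        # Called for the start/end/abort phases of the whole backup job.
    }

    sub backup_hook {
        my ($self, $phase, $vmid, $vmtype, $info) = @_;
        # Called for the start/end/abort phases of a single guest's
        # backup, plus the 'prepare' phase, because backup_container()
        # itself runs in the container's user namespace context.
    }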

1.1 Backup Mechanisms

VM:

Access to the data on the VM's disk from the time the backup started
is made available via a so-called "snapshot access". This is either
the full image, or in case a bitmap is used, the dirty parts of the
image since the last time the bitmap was used for a successful backup.
Reading outside of the dirty parts will result in an error. After
backing up each part of the disk, it should be discarded in the export
to avoid unnecessary space usage on the Proxmox VE side (there is an
associated fleecing image).

VM mechanism 'block-device':

The snapshot access is exposed as a block device. If used, a bitmap is
passed along.

VM mechanism 'nbd':

The snapshot access and, if used, bitmap are exported via NBD.

Container mechanism 'directory':

A copy or snapshot of the container's filesystem state is made
available as a directory. The method is executed inside the user
namespace associated to the container.
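
Putting the above together, a backup_vm() implementation using the
'block-device' mechanism might look roughly like the sketch below.
The parameter layout and the shape of the per-volume info (a block
device path plus a callback for the next dirty range, as mentioned in
the v2 changelog above) are assumptions for illustration:

    sub backup_vm {
        my ($self, $vmid, $vm_config, $volumes, $info) = @_;

        for my $devicename (sort keys $volumes->%*) {
            my $volume = $volumes->{$devicename};
            # Block device exposing the snapshot access.
            my $path = $volume->{path};
            # Callback returning the next dirty range, assumed to return
            # an empty list when done. Without a bitmap, the whole image
            # would be read instead (not shown).
            my $next_dirty_region = $volume->{'next-dirty-region'};

            open(my $fh, '<:raw', $path) or die "unable to open '$path' - $!\n";
            while (my ($start, $length) = $next_dirty_region->()) {
                sysseek($fh, $start, 0) // die "unable to seek - $!\n";
                my ($data, $read) = ('', 0);
                while ($read < $length) {
                    my $res = sysread($fh, $data, $length - $read, $read)
                        // die "unable to read from '$path' - $!\n";
                    die "unexpected end of file\n" if $res == 0;
                    $read += $res;
                }
                # ... write ($start, $data) to the provider's backend,
                # then discard the range in the export to free up space
                # in the fleecing image ...
            }
            close($fh);
        }
    }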

2. Restore API

The restore_get_mechanism() method is used to decide on the restore
mechanism. Currently, 'qemu-img' for VMs, and 'directory' or 'tar' for
containers are possible. It is enough to implement one mechanism for
VMs and one mechanism for containers.

Next, there are methods for extracting the guest and firewall
configuration, and the restore mechanisms are implemented via a pair
of methods: an init method for making the data available to Proxmox
VE, and a cleanup method that is called after the restore.

For VMs, a restore_vm_get_device_info() helper is also required, to
get the disks included in the backup and their sizes.
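
For example, restore_vm_get_device_info() presumably returns the
device names and sizes recorded in the backup; the exact return
structure below is an assumption:

    sub restore_vm_get_device_info {
        my ($self, $volname) = @_;
        # Map each backed-up device to its size in bytes (values made
        # up for illustration).
        return {
            'drive-scsi0' => { size => 34359738368 },
            'drive-efidisk0' => { size => 540672 },
        };
    }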

2.1. Restore Mechanisms

VM mechanism 'qemu-img':

The backup provider gives a path to the disk image that will be
restored. The path needs to be something 'qemu-img' can deal with,
e.g. it can also be an NBD URI or similar.

Container mechanism 'directory':

The backup provider gives the path to a directory with the full
filesystem structure of the container.

Container mechanism 'tar':

The backup provider gives the path to a (potentially compressed) tar
archive with the full filesystem structure of the container.
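
For the 'qemu-img' VM restore mechanism, the init/cleanup pair could
look like the following sketch, under the assumption that the init
method returns the path to the image to restore:

    sub restore_vm_volume_init {
        my ($self, $volname, $storeid, $devicename, $info) = @_;
        # Anything qemu-img can deal with works, e.g. an NBD URI. The
        # path and return structure are made up for illustration.
        return { 'qemu-img-path' => "/mnt/backup/$devicename.raw" };
    }

    sub restore_vm_volume_cleanup {
        my ($self, $volname, $storeid, $devicename, $info) = @_;
        # Tear down whatever restore_vm_volume_init() set up, e.g.
        # stop an NBD server. Nothing to do for a plain file path.
        return;
    }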

See the PVE::BackupProvider::Plugin module for the full API
documentation.

======

This series adapts the backup stack in Proxmox VE to allow using the
above API. For QEMU, backup access setup and teardown QMP commands are
implemented to be able to provide access to a consistent disk state to
the backup provider.

The series also provides an example implementation for a backup
provider as a proof-of-concept, exposing the different features.

======

Open questions:

Should the backup provider plugin system also follow the same API
age+version schema with a Custom/ directory for external plugins
derived from the base plugin?

Should the bitmap action be passed directly to the backup provider?
I.e. have 'not-used', 'not-used-removed', 'new', 'used', 'invalid',
instead of only 'none', 'new' and 'reuse'. It makes the API slightly
more complicated. Is there any situation where the backup provider
would care whether a bitmap is new because it is the first one, or
new because the previous one was invalid? Both cases require the
backup provider to do a full backup.

======

The patches marked as PATCH rather than RFC can make sense
independently, with QEMU patches 02 and 03 having already been sent
before (they touch the same code, so they are included here):

https://lore.proxmox.com/pve-devel/20240625133551.210636-1-f.ebner@proxmox.com/#r

======

Feedback is very welcome, especially from people wishing to implement
such a backup provider plugin! Please tell me what issues you see with
the proposed API, and what would and wouldn't work from your
perspective.

======

Dependencies: pve-manager, pve-container and qemu-server all depend on
new libpve-storage-perl. pve-manager also build-depends on the new
libpve-storage-perl for its tests. pve-container depends on new
pve-common. To keep things clean, pve-manager should also depend on
new pve-container and qemu-server.

In qemu-server, there is no version guard added yet, as that depends
on the QEMU version the feature will land in.

======

qemu:

Fiona Ebner (9):
  block/reqlist: allow adding overlapping requests
  PVE backup: fixup error handling for fleecing
  PVE backup: factor out setting up snapshot access for fleecing
  PVE backup: save device name in device info structure
  PVE backup: include device name in error when setting up snapshot
    access fails
  PVE backup: add target ID in backup state
  PVE backup: get device info: allow caller to specify filter for which
    devices use fleecing
  PVE backup: implement backup access setup and teardown API for
    external providers
  PVE backup: implement bitmap support for external backup access

 block/copy-before-write.c |   3 +-
 block/reqlist.c           |   2 -
 pve-backup.c              | 620 +++++++++++++++++++++++++++++++++-----
 pve-backup.h              |  16 +
 qapi/block-core.json      |  61 ++++
 system/runstate.c         |   6 +
 6 files changed, 637 insertions(+), 71 deletions(-)
 create mode 100644 pve-backup.h


common:

Fiona Ebner (1):
  env: add module with helpers to run a Perl subroutine in a user
    namespace

 src/Makefile   |   1 +
 src/PVE/Env.pm | 136 +++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 137 insertions(+)
 create mode 100644 src/PVE/Env.pm


storage:

Fiona Ebner (5):
  add storage_has_feature() helper function
  plugin: introduce new_backup_provider() method
  extract backup config: delegate to backup provider for storages that
    support it
  add backup provider example
  WIP Borg plugin

 src/PVE/API2/Storage/Config.pm                |    2 +-
 src/PVE/BackupProvider/Makefile               |    3 +
 src/PVE/BackupProvider/Plugin/Base.pm         | 1158 +++++++++++++++++
 src/PVE/BackupProvider/Plugin/Borg.pm         |  439 +++++++
 .../BackupProvider/Plugin/DirectoryExample.pm |  697 ++++++++++
 src/PVE/BackupProvider/Plugin/Makefile        |    5 +
 src/PVE/Makefile                              |    1 +
 src/PVE/Storage.pm                            |   33 +-
 src/PVE/Storage/BorgBackupPlugin.pm           |  595 +++++++++
 .../Custom/BackupProviderDirExamplePlugin.pm  |  307 +++++
 src/PVE/Storage/Custom/Makefile               |    5 +
 src/PVE/Storage/Makefile                      |    2 +
 src/PVE/Storage/Plugin.pm                     |   25 +
 13 files changed, 3269 insertions(+), 3 deletions(-)
 create mode 100644 src/PVE/BackupProvider/Makefile
 create mode 100644 src/PVE/BackupProvider/Plugin/Base.pm
 create mode 100644 src/PVE/BackupProvider/Plugin/Borg.pm
 create mode 100644 src/PVE/BackupProvider/Plugin/DirectoryExample.pm
 create mode 100644 src/PVE/BackupProvider/Plugin/Makefile
 create mode 100644 src/PVE/Storage/BorgBackupPlugin.pm
 create mode 100644 src/PVE/Storage/Custom/BackupProviderDirExamplePlugin.pm
 create mode 100644 src/PVE/Storage/Custom/Makefile


qemu-server:

Fiona Ebner (9):
  move nbd_stop helper to QMPHelpers module
  backup: move cleanup of fleecing images to cleanup method
  backup: cleanup: check if VM is running before issuing QMP commands
  backup: keep track of block-node size for fleecing
  backup: allow adding fleecing images also for EFI and TPM
  backup: implement backup for external providers
  restore: die early when there is no size for a device
  backup: implement restore for external providers
  backup restore: external: hardening check for untrusted source image

 PVE/API2/Qemu.pm             |  33 ++-
 PVE/CLI/qm.pm                |   3 +-
 PVE/QemuServer.pm            | 152 +++++++++++++-
 PVE/QemuServer/QMPHelpers.pm |   6 +
 PVE/VZDump/QemuServer.pm     | 382 ++++++++++++++++++++++++++++++++---
 5 files changed, 539 insertions(+), 37 deletions(-)


container:

Fiona Ebner (8):
  create: add missing include of PVE::Storage::Plugin
  backup: implement backup for external providers
  create: factor out tar restore command helper
  backup: implement restore for external providers
  external restore: don't use 'one-file-system' tar flag when restoring
    from a directory
  create: factor out compression option helper
  restore tar archive: check potentially untrusted archive
  api: add early check against restoring privileged container from
    external source

 src/PVE/API2/LXC.pm   |  14 +++
 src/PVE/LXC/Create.pm | 284 +++++++++++++++++++++++++++++++++++++-----
 src/PVE/VZDump/LXC.pm |  38 +++++-
 3 files changed, 304 insertions(+), 32 deletions(-)


manager:

Fiona Ebner (2):
  ui: backup: also check for backup subtype to classify archive
  backup: implement backup for external providers

 PVE/VZDump.pm                      | 57 ++++++++++++++++++++++++++----
 test/vzdump_new_test.pl            |  3 ++
 www/manager6/Utils.js              | 10 +++---
 www/manager6/grid/BackupView.js    |  4 +--
 www/manager6/storage/BackupView.js |  4 +--
 5 files changed, 63 insertions(+), 15 deletions(-)


Summary over all repositories:
  34 files changed, 4949 insertions(+), 158 deletions(-)

-- 
Generated by git-murpp 0.5.0



* [pve-devel] [PATCH qemu v3 01/34] block/reqlist: allow adding overlapping requests
  2024-11-07 16:51 [pve-devel] [RFC qemu/common/storage/qemu-server/container/manager v3 00/34] backup provider API Fiona Ebner
@ 2024-11-07 16:51 ` Fiona Ebner
  2024-11-07 16:51 ` [pve-devel] [PATCH qemu v3 02/34] PVE backup: fixup error handling for fleecing Fiona Ebner
                   ` (33 subsequent siblings)
  34 siblings, 0 replies; 63+ messages in thread
From: Fiona Ebner @ 2024-11-07 16:51 UTC (permalink / raw)
  To: pve-devel

Allow overlapping requests by removing the assert that made it
impossible. There are only two callers:

1. block_copy_task_create()

It already asserts the very same condition before calling
reqlist_init_req().

2. cbw_snapshot_read_lock()

There is no need to have read requests be non-overlapping in
copy-before-write when used for snapshot-access. In fact, there was no
protection against two callers of cbw_snapshot_read_lock() calling
reqlist_init_req() with overlapping ranges and this could lead to an
assertion failure [1].

In particular, with the reproducer script below [0], two
cbw_co_snapshot_block_status() callers could race, with the second
calling reqlist_init_req() before the first one finishes and removes
its conflicting request.

[0]:

> #!/bin/bash -e
> dd if=/dev/urandom of=/tmp/disk.raw bs=1M count=1024
> ./qemu-img create /tmp/fleecing.raw -f raw 1G
> (
> ./qemu-system-x86_64 --qmp stdio \
> --blockdev raw,node-name=node0,file.driver=file,file.filename=/tmp/disk.raw \
> --blockdev raw,node-name=node1,file.driver=file,file.filename=/tmp/fleecing.raw \
> <<EOF
> {"execute": "qmp_capabilities"}
> {"execute": "blockdev-add", "arguments": { "driver": "copy-before-write", "file": "node0", "target": "node1", "node-name": "node3" } }
> {"execute": "blockdev-add", "arguments": { "driver": "snapshot-access", "file": "node3", "node-name": "snap0" } }
> {"execute": "nbd-server-start", "arguments": {"addr": { "type": "unix", "data": { "path": "/tmp/nbd.socket" } } } }
> {"execute": "block-export-add", "arguments": {"id": "exp0", "node-name": "snap0", "type": "nbd", "name": "exp0"}}
> EOF
> ) &
> sleep 5
> while true; do
> ./qemu-nbd -d /dev/nbd0
> ./qemu-nbd -c /dev/nbd0 nbd:unix:/tmp/nbd.socket:exportname=exp0 -f raw -r
> nbdinfo --map 'nbd+unix:///exp0?socket=/tmp/nbd.socket'
> done

[1]:

> #5  0x000071e5f0088eb2 in __GI___assert_fail (...) at ./assert/assert.c:101
> #6  0x0000615285438017 in reqlist_init_req (...) at ../block/reqlist.c:23
> #7  0x00006152853e2d98 in cbw_snapshot_read_lock (...) at ../block/copy-before-write.c:237
> #8  0x00006152853e3068 in cbw_co_snapshot_block_status (...) at ../block/copy-before-write.c:304
> #9  0x00006152853f4d22 in bdrv_co_snapshot_block_status (...) at ../block/io.c:3726
> #10 0x000061528543a63e in snapshot_access_co_block_status (...) at ../block/snapshot-access.c:48
> #11 0x00006152853f1a0a in bdrv_co_do_block_status (...) at ../block/io.c:2474
> #12 0x00006152853f2016 in bdrv_co_common_block_status_above (...) at ../block/io.c:2652
> #13 0x00006152853f22cf in bdrv_co_block_status_above (...) at ../block/io.c:2732
> #14 0x00006152853d9a86 in blk_co_block_status_above (...) at ../block/block-backend.c:1473
> #15 0x000061528538da6c in blockstatus_to_extents (...) at ../nbd/server.c:2374
> #16 0x000061528538deb1 in nbd_co_send_block_status (...) at ../nbd/server.c:2481
> #17 0x000061528538f424 in nbd_handle_request (...) at ../nbd/server.c:2978
> #18 0x000061528538f906 in nbd_trip (...) at ../nbd/server.c:3121
> #19 0x00006152855a7caf in coroutine_trampoline (...) at ../util/coroutine-ucontext.c:175

Cc: qemu-stable@nongnu.org
Suggested-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
---

No changes in v3.

 block/copy-before-write.c | 3 ++-
 block/reqlist.c           | 2 --
 2 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/block/copy-before-write.c b/block/copy-before-write.c
index 50cc4c7aae..a5bb4d14f6 100644
--- a/block/copy-before-write.c
+++ b/block/copy-before-write.c
@@ -67,7 +67,8 @@ typedef struct BDRVCopyBeforeWriteState {
 
     /*
      * @frozen_read_reqs: current read requests for fleecing user in bs->file
-     * node. These areas must not be rewritten by guest.
+     * node. These areas must not be rewritten by guest. There can be multiple
+     * overlapping read requests.
      */
     BlockReqList frozen_read_reqs;
 
diff --git a/block/reqlist.c b/block/reqlist.c
index 08cb57cfa4..098e807378 100644
--- a/block/reqlist.c
+++ b/block/reqlist.c
@@ -20,8 +20,6 @@
 void reqlist_init_req(BlockReqList *reqs, BlockReq *req, int64_t offset,
                       int64_t bytes)
 {
-    assert(!reqlist_find_conflict(reqs, offset, bytes));
-
     *req = (BlockReq) {
         .offset = offset,
         .bytes = bytes,
-- 
2.39.5




* [pve-devel] [PATCH qemu v3 02/34] PVE backup: fixup error handling for fleecing
  2024-11-07 16:51 [pve-devel] [RFC qemu/common/storage/qemu-server/container/manager v3 00/34] backup provider API Fiona Ebner
  2024-11-07 16:51 ` [pve-devel] [PATCH qemu v3 01/34] block/reqlist: allow adding overlapping requests Fiona Ebner
@ 2024-11-07 16:51 ` Fiona Ebner
  2024-11-07 16:51 ` [pve-devel] [PATCH qemu v3 03/34] PVE backup: factor out setting up snapshot access " Fiona Ebner
                   ` (32 subsequent siblings)
  34 siblings, 0 replies; 63+ messages in thread
From: Fiona Ebner @ 2024-11-07 16:51 UTC (permalink / raw)
  To: pve-devel

The drained section needs to be terminated before breaking out of the
loop in the error scenarios. Otherwise, guest IO on the drive would
become stuck.

If the job is created successfully, then the job completion callback
will clean up the snapshot access block nodes. In case failure
happened before the job is created, there was no cleanup for the
snapshot access block nodes yet. Add it.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---

No changes in v3.

 pve-backup.c | 38 +++++++++++++++++++++++++-------------
 1 file changed, 25 insertions(+), 13 deletions(-)

diff --git a/pve-backup.c b/pve-backup.c
index 4e730aa3da..c4178758b3 100644
--- a/pve-backup.c
+++ b/pve-backup.c
@@ -357,22 +357,23 @@ static void coroutine_fn pvebackup_co_complete_stream(void *opaque)
     qemu_co_mutex_unlock(&backup_state.backup_mutex);
 }
 
+static void cleanup_snapshot_access(PVEBackupDevInfo *di)
+{
+    if (di->fleecing.snapshot_access) {
+        bdrv_unref(di->fleecing.snapshot_access);
+        di->fleecing.snapshot_access = NULL;
+    }
+    if (di->fleecing.cbw) {
+        bdrv_cbw_drop(di->fleecing.cbw);
+        di->fleecing.cbw = NULL;
+    }
+}
+
 static void pvebackup_complete_cb(void *opaque, int ret)
 {
     PVEBackupDevInfo *di = opaque;
     di->completed_ret = ret;
 
-    /*
-     * Handle block-graph specific cleanup (for fleecing) outside of the coroutine, because the work
-     * won't be done as a coroutine anyways:
-     * - For snapshot_access, allows doing bdrv_unref() directly. Doing it via bdrv_co_unref() would
-     *   just spawn a BH calling bdrv_unref().
-     * - For cbw, draining would need to spawn a BH.
-     */
-    if (di->fleecing.snapshot_access) {
-        bdrv_unref(di->fleecing.snapshot_access);
-        di->fleecing.snapshot_access = NULL;
-    }
     if (di->fleecing.cbw) {
         /*
          * With fleecing, failure for cbw does not fail the guest write, but only sets the snapshot
@@ -383,10 +384,17 @@ static void pvebackup_complete_cb(void *opaque, int ret)
         if (di->completed_ret == -EACCES && snapshot_error) {
             di->completed_ret = snapshot_error;
         }
-        bdrv_cbw_drop(di->fleecing.cbw);
-        di->fleecing.cbw = NULL;
     }
 
+    /*
+     * Handle block-graph specific cleanup (for fleecing) outside of the coroutine, because the work
+     * won't be done as a coroutine anyways:
+     * - For snapshot_access, allows doing bdrv_unref() directly. Doing it via bdrv_co_unref() would
+     *   just spawn a BH calling bdrv_unref().
+     * - For cbw, draining would need to spawn a BH.
+     */
+    cleanup_snapshot_access(di);
+
     /*
      * Needs to happen outside of coroutine, because it takes the graph write lock.
      */
@@ -587,6 +595,7 @@ static void create_backup_jobs_bh(void *opaque) {
             if (!di->fleecing.cbw) {
                 error_setg(errp, "appending cbw node for fleecing failed: %s",
                            local_err ? error_get_pretty(local_err) : "unknown error");
+                bdrv_drained_end(di->bs);
                 break;
             }
 
@@ -599,6 +608,8 @@ static void create_backup_jobs_bh(void *opaque) {
             if (!di->fleecing.snapshot_access) {
                 error_setg(errp, "setting up snapshot access for fleecing failed: %s",
                            local_err ? error_get_pretty(local_err) : "unknown error");
+                cleanup_snapshot_access(di);
+                bdrv_drained_end(di->bs);
                 break;
             }
             source_bs = di->fleecing.snapshot_access;
@@ -637,6 +648,7 @@ static void create_backup_jobs_bh(void *opaque) {
         }
 
         if (!job || local_err) {
+            cleanup_snapshot_access(di);
             error_setg(errp, "backup_job_create failed: %s",
                        local_err ? error_get_pretty(local_err) : "null");
             break;
-- 
2.39.5




* [pve-devel] [PATCH qemu v3 03/34] PVE backup: factor out setting up snapshot access for fleecing
  2024-11-07 16:51 [pve-devel] [RFC qemu/common/storage/qemu-server/container/manager v3 00/34] backup provider API Fiona Ebner
  2024-11-07 16:51 ` [pve-devel] [PATCH qemu v3 01/34] block/reqlist: allow adding overlapping requests Fiona Ebner
  2024-11-07 16:51 ` [pve-devel] [PATCH qemu v3 02/34] PVE backup: fixup error handling for fleecing Fiona Ebner
@ 2024-11-07 16:51 ` Fiona Ebner
  2024-11-07 16:51 ` [pve-devel] [PATCH qemu v3 04/34] PVE backup: save device name in device info structure Fiona Ebner
                   ` (31 subsequent siblings)
  34 siblings, 0 replies; 63+ messages in thread
From: Fiona Ebner @ 2024-11-07 16:51 UTC (permalink / raw)
  To: pve-devel

Avoids some line bloat in the create_backup_jobs_bh() function and is
in preparation for setting up the snapshot access independently of
fleecing; in particular, that will be useful for providing access to
the snapshot via NBD.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---

No changes in v3.

 pve-backup.c | 95 ++++++++++++++++++++++++++++++++--------------------
 1 file changed, 58 insertions(+), 37 deletions(-)

diff --git a/pve-backup.c b/pve-backup.c
index c4178758b3..051ebffe48 100644
--- a/pve-backup.c
+++ b/pve-backup.c
@@ -525,6 +525,62 @@ static int coroutine_fn pvebackup_co_add_config(
     goto out;
 }
 
+/*
+ * Setup a snapshot-access block node for a device with associated fleecing image.
+ */
+static int setup_snapshot_access(PVEBackupDevInfo *di, Error **errp)
+{
+    Error *local_err = NULL;
+
+    if (!di->fleecing.bs) {
+        error_setg(errp, "no associated fleecing image");
+        return -1;
+    }
+
+    QDict *cbw_opts = qdict_new();
+    qdict_put_str(cbw_opts, "driver", "copy-before-write");
+    qdict_put_str(cbw_opts, "file", bdrv_get_node_name(di->bs));
+    qdict_put_str(cbw_opts, "target", bdrv_get_node_name(di->fleecing.bs));
+
+    if (di->bitmap) {
+        /*
+         * Only guest writes to parts relevant for the backup need to be intercepted with
+         * old data being copied to the fleecing image.
+         */
+        qdict_put_str(cbw_opts, "bitmap.node", bdrv_get_node_name(di->bs));
+        qdict_put_str(cbw_opts, "bitmap.name", bdrv_dirty_bitmap_name(di->bitmap));
+    }
+    /*
+     * Fleecing storage is supposed to be fast and it's better to break backup than guest
+     * writes. Certain guest drivers like VirtIO-win have 60 seconds timeout by default, so
+     * abort a bit before that.
+     */
+    qdict_put_str(cbw_opts, "on-cbw-error", "break-snapshot");
+    qdict_put_int(cbw_opts, "cbw-timeout", 45);
+
+    di->fleecing.cbw = bdrv_insert_node(di->bs, cbw_opts, BDRV_O_RDWR, &local_err);
+
+    if (!di->fleecing.cbw) {
+        error_setg(errp, "appending cbw node for fleecing failed: %s",
+                   local_err ? error_get_pretty(local_err) : "unknown error");
+        return -1;
+    }
+
+    QDict *snapshot_access_opts = qdict_new();
+    qdict_put_str(snapshot_access_opts, "driver", "snapshot-access");
+    qdict_put_str(snapshot_access_opts, "file", bdrv_get_node_name(di->fleecing.cbw));
+
+    di->fleecing.snapshot_access =
+        bdrv_open(NULL, NULL, snapshot_access_opts, BDRV_O_RDWR | BDRV_O_UNMAP, &local_err);
+    if (!di->fleecing.snapshot_access) {
+        error_setg(errp, "setting up snapshot access for fleecing failed: %s",
+                   local_err ? error_get_pretty(local_err) : "unknown error");
+        return -1;
+    }
+
+    return 0;
+}
+
 /*
  * backup_job_create can *not* be run from a coroutine, so this can't either.
  * The caller is responsible that backup_mutex is held nonetheless.
@@ -569,49 +625,14 @@ static void create_backup_jobs_bh(void *opaque) {
         const char *job_id = bdrv_get_device_name(di->bs);
         bdrv_graph_co_rdunlock();
         if (di->fleecing.bs) {
-            QDict *cbw_opts = qdict_new();
-            qdict_put_str(cbw_opts, "driver", "copy-before-write");
-            qdict_put_str(cbw_opts, "file", bdrv_get_node_name(di->bs));
-            qdict_put_str(cbw_opts, "target", bdrv_get_node_name(di->fleecing.bs));
-
-            if (di->bitmap) {
-                /*
-                 * Only guest writes to parts relevant for the backup need to be intercepted with
-                 * old data being copied to the fleecing image.
-                 */
-                qdict_put_str(cbw_opts, "bitmap.node", bdrv_get_node_name(di->bs));
-                qdict_put_str(cbw_opts, "bitmap.name", bdrv_dirty_bitmap_name(di->bitmap));
-            }
-            /*
-             * Fleecing storage is supposed to be fast and it's better to break backup than guest
-             * writes. Certain guest drivers like VirtIO-win have 60 seconds timeout by default, so
-             * abort a bit before that.
-             */
-            qdict_put_str(cbw_opts, "on-cbw-error", "break-snapshot");
-            qdict_put_int(cbw_opts, "cbw-timeout", 45);
-
-            di->fleecing.cbw = bdrv_insert_node(di->bs, cbw_opts, BDRV_O_RDWR, &local_err);
-
-            if (!di->fleecing.cbw) {
-                error_setg(errp, "appending cbw node for fleecing failed: %s",
-                           local_err ? error_get_pretty(local_err) : "unknown error");
-                bdrv_drained_end(di->bs);
-                break;
-            }
-
-            QDict *snapshot_access_opts = qdict_new();
-            qdict_put_str(snapshot_access_opts, "driver", "snapshot-access");
-            qdict_put_str(snapshot_access_opts, "file", bdrv_get_node_name(di->fleecing.cbw));
-
-            di->fleecing.snapshot_access =
-                bdrv_open(NULL, NULL, snapshot_access_opts, BDRV_O_RDWR | BDRV_O_UNMAP, &local_err);
-            if (!di->fleecing.snapshot_access) {
+            if (setup_snapshot_access(di, &local_err) < 0) {
                 error_setg(errp, "setting up snapshot access for fleecing failed: %s",
                            local_err ? error_get_pretty(local_err) : "unknown error");
                 cleanup_snapshot_access(di);
                 bdrv_drained_end(di->bs);
                 break;
             }
+
             source_bs = di->fleecing.snapshot_access;
             discard_source = true;
 
-- 
2.39.5




* [pve-devel] [PATCH qemu v3 04/34] PVE backup: save device name in device info structure
  2024-11-07 16:51 [pve-devel] [RFC qemu/common/storage/qemu-server/container/manager v3 00/34] backup provider API Fiona Ebner
                   ` (2 preceding siblings ...)
  2024-11-07 16:51 ` [pve-devel] [PATCH qemu v3 03/34] PVE backup: factor out setting up snapshot access " Fiona Ebner
@ 2024-11-07 16:51 ` Fiona Ebner
  2024-11-07 16:51 ` [pve-devel] [PATCH qemu v3 05/34] PVE backup: include device name in error when setting up snapshot access fails Fiona Ebner
                   ` (30 subsequent siblings)
  34 siblings, 0 replies; 63+ messages in thread
From: Fiona Ebner @ 2024-11-07 16:51 UTC (permalink / raw)
  To: pve-devel

The device name needs to be queried while holding the graph read lock,
and since it doesn't change during the whole operation, just get it
once during setup and avoid the need to query it again in different
places.

Also in preparation to use it more often in error messages and for the
upcoming external backup access API.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---

No changes in v3.

 pve-backup.c | 29 +++++++++++++++--------------
 1 file changed, 15 insertions(+), 14 deletions(-)

diff --git a/pve-backup.c b/pve-backup.c
index 051ebffe48..33c23e53c2 100644
--- a/pve-backup.c
+++ b/pve-backup.c
@@ -94,6 +94,7 @@ typedef struct PVEBackupDevInfo {
     size_t size;
     uint64_t block_size;
     uint8_t dev_id;
+    char* device_name;
     int completed_ret; // INT_MAX if not completed
     BdrvDirtyBitmap *bitmap;
     BlockDriverState *target;
@@ -327,6 +328,8 @@ static void coroutine_fn pvebackup_co_complete_stream(void *opaque)
     }
 
     di->bs = NULL;
+    g_free(di->device_name);
+    di->device_name = NULL;
 
     assert(di->target == NULL);
 
@@ -621,9 +624,6 @@ static void create_backup_jobs_bh(void *opaque) {
 
         BlockDriverState *source_bs = di->bs;
         bool discard_source = false;
-        bdrv_graph_co_rdlock();
-        const char *job_id = bdrv_get_device_name(di->bs);
-        bdrv_graph_co_rdunlock();
         if (di->fleecing.bs) {
             if (setup_snapshot_access(di, &local_err) < 0) {
                 error_setg(errp, "setting up snapshot access for fleecing failed: %s",
@@ -654,7 +654,7 @@ static void create_backup_jobs_bh(void *opaque) {
         }
 
         BlockJob *job = backup_job_create(
-            job_id, source_bs, di->target, backup_state.speed, sync_mode, di->bitmap,
+            di->device_name, source_bs, di->target, backup_state.speed, sync_mode, di->bitmap,
             bitmap_mode, false, discard_source, NULL, &perf, BLOCKDEV_ON_ERROR_REPORT,
             BLOCKDEV_ON_ERROR_REPORT, JOB_DEFAULT, pvebackup_complete_cb, di, backup_state.txn,
             &local_err);
@@ -751,6 +751,7 @@ static GList coroutine_fn GRAPH_RDLOCK *get_device_info(
             }
             PVEBackupDevInfo *di = g_new0(PVEBackupDevInfo, 1);
             di->bs = bs;
+            di->device_name = g_strdup(bdrv_get_device_name(bs));
 
             if (fleecing && device_uses_fleecing(*d)) {
                 g_autofree gchar *fleecing_devid = g_strconcat(*d, "-fleecing", NULL);
@@ -789,6 +790,7 @@ static GList coroutine_fn GRAPH_RDLOCK *get_device_info(
 
             PVEBackupDevInfo *di = g_new0(PVEBackupDevInfo, 1);
             di->bs = bs;
+            di->device_name = g_strdup(bdrv_get_device_name(bs));
             di_list = g_list_append(di_list, di);
         }
     }
@@ -956,9 +958,6 @@ UuidInfo coroutine_fn *qmp_backup(
 
             di->block_size = dump_cb_block_size;
 
-            bdrv_graph_co_rdlock();
-            const char *devname = bdrv_get_device_name(di->bs);
-            bdrv_graph_co_rdunlock();
             PBSBitmapAction action = PBS_BITMAP_ACTION_NOT_USED;
             size_t dirty = di->size;
 
@@ -973,7 +972,8 @@ UuidInfo coroutine_fn *qmp_backup(
                     }
                     action = PBS_BITMAP_ACTION_NEW;
                 } else {
-                    expect_only_dirty = proxmox_backup_check_incremental(pbs, devname, di->size) != 0;
+                    expect_only_dirty =
+                        proxmox_backup_check_incremental(pbs, di->device_name, di->size) != 0;
                 }
 
                 if (expect_only_dirty) {
@@ -997,7 +997,8 @@ UuidInfo coroutine_fn *qmp_backup(
                 }
             }
 
-            int dev_id = proxmox_backup_co_register_image(pbs, devname, di->size, expect_only_dirty, errp);
+            int dev_id = proxmox_backup_co_register_image(pbs, di->device_name, di->size,
+                                                          expect_only_dirty, errp);
             if (dev_id < 0) {
                 goto err_mutex;
             }
@@ -1009,7 +1010,7 @@ UuidInfo coroutine_fn *qmp_backup(
             di->dev_id = dev_id;
 
             PBSBitmapInfo *info = g_malloc(sizeof(*info));
-            info->drive = g_strdup(devname);
+            info->drive = g_strdup(di->device_name);
             info->action = action;
             info->size = di->size;
             info->dirty = dirty;
@@ -1034,10 +1035,7 @@ UuidInfo coroutine_fn *qmp_backup(
                 goto err_mutex;
             }
 
-            bdrv_graph_co_rdlock();
-            const char *devname = bdrv_get_device_name(di->bs);
-            bdrv_graph_co_rdunlock();
-            di->dev_id = vma_writer_register_stream(vmaw, devname, di->size);
+            di->dev_id = vma_writer_register_stream(vmaw, di->device_name, di->size);
             if (di->dev_id <= 0) {
                 error_set(errp, ERROR_CLASS_GENERIC_ERROR,
                           "register_stream failed");
@@ -1148,6 +1146,9 @@ err:
             bdrv_co_unref(di->target);
         }
 
+        g_free(di->device_name);
+        di->device_name = NULL;
+
         g_free(di);
     }
     g_list_free(di_list);
-- 
2.39.5




* [pve-devel] [PATCH qemu v3 05/34] PVE backup: include device name in error when setting up snapshot access fails
  2024-11-07 16:51 [pve-devel] [RFC qemu/common/storage/qemu-server/container/manager v3 00/34] backup provider API Fiona Ebner
                   ` (3 preceding siblings ...)
  2024-11-07 16:51 ` [pve-devel] [PATCH qemu v3 04/34] PVE backup: save device name in device info structure Fiona Ebner
@ 2024-11-07 16:51 ` Fiona Ebner
  2024-11-07 16:51 ` [pve-devel] [RFC qemu v3 06/34] PVE backup: add target ID in backup state Fiona Ebner
                   ` (29 subsequent siblings)
  34 siblings, 0 replies; 63+ messages in thread
From: Fiona Ebner @ 2024-11-07 16:51 UTC (permalink / raw)
  To: pve-devel

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---

No changes in v3.

 pve-backup.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/pve-backup.c b/pve-backup.c
index 33c23e53c2..d931746453 100644
--- a/pve-backup.c
+++ b/pve-backup.c
@@ -626,7 +626,8 @@ static void create_backup_jobs_bh(void *opaque) {
         bool discard_source = false;
         if (di->fleecing.bs) {
             if (setup_snapshot_access(di, &local_err) < 0) {
-                error_setg(errp, "setting up snapshot access for fleecing failed: %s",
+                error_setg(errp, "%s - setting up snapshot access for fleecing failed: %s",
+                           di->device_name,
                            local_err ? error_get_pretty(local_err) : "unknown error");
                 cleanup_snapshot_access(di);
                 bdrv_drained_end(di->bs);
-- 
2.39.5




* [pve-devel] [RFC qemu v3 06/34] PVE backup: add target ID in backup state
  2024-11-07 16:51 [pve-devel] [RFC qemu/common/storage/qemu-server/container/manager v3 00/34] backup provider API Fiona Ebner
                   ` (4 preceding siblings ...)
  2024-11-07 16:51 ` [pve-devel] [PATCH qemu v3 05/34] PVE backup: include device name in error when setting up snapshot access fails Fiona Ebner
@ 2024-11-07 16:51 ` Fiona Ebner
  2024-11-12 16:46   ` Fabian Grünbichler
  2024-11-07 16:51 ` [pve-devel] [RFC qemu v3 07/34] PVE backup: get device info: allow caller to specify filter for which devices use fleecing Fiona Ebner
                   ` (28 subsequent siblings)
  34 siblings, 1 reply; 63+ messages in thread
From: Fiona Ebner @ 2024-11-07 16:51 UTC (permalink / raw)
  To: pve-devel

In preparation for allowing multiple backup providers. Each backup
target can then have its own dirty bitmap and there can be additional
checks that the current backup state is actually associated to the
expected target.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---

No changes in v3.

 pve-backup.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/pve-backup.c b/pve-backup.c
index d931746453..e8031bb89c 100644
--- a/pve-backup.c
+++ b/pve-backup.c
@@ -70,6 +70,7 @@ static struct PVEBackupState {
     JobTxn *txn;
     CoMutex backup_mutex;
     CoMutex dump_callback_mutex;
+    char *target_id;
 } backup_state;
 
 static void pvebackup_init(void)
@@ -848,7 +849,7 @@ UuidInfo coroutine_fn *qmp_backup(
 
     if (backup_state.di_list) {
         error_set(errp, ERROR_CLASS_GENERIC_ERROR,
-                  "previous backup not finished");
+                  "previous backup by provider '%s' not finished", backup_state.target_id);
         qemu_co_mutex_unlock(&backup_state.backup_mutex);
         return NULL;
     }
@@ -1100,6 +1101,11 @@ UuidInfo coroutine_fn *qmp_backup(
     backup_state.vmaw = vmaw;
     backup_state.pbs = pbs;
 
+    if (backup_state.target_id) {
+        g_free(backup_state.target_id);
+    }
+    backup_state.target_id = g_strdup("Proxmox");
+
     backup_state.di_list = di_list;
 
     uuid_info = g_malloc0(sizeof(*uuid_info));
-- 
2.39.5




* [pve-devel] [RFC qemu v3 07/34] PVE backup: get device info: allow caller to specify filter for which devices use fleecing
  2024-11-07 16:51 [pve-devel] [RFC qemu/common/storage/qemu-server/container/manager v3 00/34] backup provider API Fiona Ebner
                   ` (5 preceding siblings ...)
  2024-11-07 16:51 ` [pve-devel] [RFC qemu v3 06/34] PVE backup: add target ID in backup state Fiona Ebner
@ 2024-11-07 16:51 ` Fiona Ebner
  2024-11-07 16:51 ` [pve-devel] [RFC qemu v3 08/34] PVE backup: implement backup access setup and teardown API for external providers Fiona Ebner
                   ` (27 subsequent siblings)
  34 siblings, 0 replies; 63+ messages in thread
From: Fiona Ebner @ 2024-11-07 16:51 UTC (permalink / raw)
  To: pve-devel

For providing snapshot-access to external backup providers, EFI and
TPM also need an associated fleecing image. The new caller will thus
need a different filter.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---

No changes in v3.

 pve-backup.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/pve-backup.c b/pve-backup.c
index e8031bb89c..d0593fc581 100644
--- a/pve-backup.c
+++ b/pve-backup.c
@@ -717,7 +717,7 @@ static void create_backup_jobs_bh(void *opaque) {
 /*
  * EFI disk and TPM state are small and it's just not worth setting up fleecing for them.
  */
-static bool device_uses_fleecing(const char *device_id)
+static bool fleecing_no_efi_tpm(const char *device_id)
 {
     return strncmp(device_id, "drive-efidisk", 13) && strncmp(device_id, "drive-tpmstate", 14);
 }
@@ -729,7 +729,7 @@ static bool device_uses_fleecing(const char *device_id)
  */
 static GList coroutine_fn GRAPH_RDLOCK *get_device_info(
     const char *devlist,
-    bool fleecing,
+    bool (*device_uses_fleecing)(const char*),
     Error **errp)
 {
     gchar **devs = NULL;
@@ -755,7 +755,7 @@ static GList coroutine_fn GRAPH_RDLOCK *get_device_info(
             di->bs = bs;
             di->device_name = g_strdup(bdrv_get_device_name(bs));
 
-            if (fleecing && device_uses_fleecing(*d)) {
+            if (device_uses_fleecing && device_uses_fleecing(*d)) {
                 g_autofree gchar *fleecing_devid = g_strconcat(*d, "-fleecing", NULL);
                 BlockBackend *fleecing_blk = blk_by_name(fleecing_devid);
                 if (!fleecing_blk) {
@@ -858,7 +858,8 @@ UuidInfo coroutine_fn *qmp_backup(
     format = has_format ? format : BACKUP_FORMAT_VMA;
 
     bdrv_graph_co_rdlock();
-    di_list = get_device_info(devlist, has_fleecing && fleecing, &local_err);
+    di_list = get_device_info(devlist, (has_fleecing && fleecing) ? fleecing_no_efi_tpm : NULL,
+                              &local_err);
     bdrv_graph_co_rdunlock();
     if (local_err) {
         error_propagate(errp, local_err);
-- 
2.39.5




* [pve-devel] [RFC qemu v3 08/34] PVE backup: implement backup access setup and teardown API for external providers
  2024-11-07 16:51 [pve-devel] [RFC qemu/common/storage/qemu-server/container/manager v3 00/34] backup provider API Fiona Ebner
                   ` (6 preceding siblings ...)
  2024-11-07 16:51 ` [pve-devel] [RFC qemu v3 07/34] PVE backup: get device info: allow caller to specify filter for which devices use fleecing Fiona Ebner
@ 2024-11-07 16:51 ` Fiona Ebner
  2024-11-07 16:51 ` [pve-devel] [RFC qemu v3 09/34] PVE backup: implement bitmap support for external backup access Fiona Ebner
                   ` (26 subsequent siblings)
  34 siblings, 0 replies; 63+ messages in thread
From: Fiona Ebner @ 2024-11-07 16:51 UTC (permalink / raw)
  To: pve-devel

For external backup providers, the state of the VM's disk images at
the time the backup is started is preserved via a snapshot-access
block node. Old data is moved to the fleecing image when new guest
writes come in. The snapshot-access block node, as well as the
associated bitmap in case of incremental backup, will be exported via
NBD to the external provider. The NBD export will be done by the
management layer, the missing functionality is setting up and tearing
down the snapshot-access block nodes, which this patch adds.

It is necessary to also set up fleecing for EFI and TPM disks, so that
old data can be moved out of the way when a new guest write comes in.

There can only be one regular backup or one active backup access at
a time, because both require replacing the original block node of the
drive. Thus the backup state is re-used, and checks are added to
prohibit regular backup while snapshot access is active and vice
versa.

The block nodes added by the backup-access-setup QMP call are not
tracked anywhere else (there is no job they are associated to like for
regular backup). This requires adding a callback for teardown when
QEMU exits, i.e. in qemu_cleanup(). Otherwise, there will be an
assertion failure that the block graph is not empty when QEMU exits
before the backup-access-teardown QMP command is called.

The code for qmp_backup_access_setup() was based on the existing
qmp_backup() routine.

The return value for the setup QMP command contains information about
the snapshot-access block nodes that can be used by the management
layer to set up the NBD exports.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---

No changes in v3.

 pve-backup.c         | 273 +++++++++++++++++++++++++++++++++++++++++++
 pve-backup.h         |  16 +++
 qapi/block-core.json |  45 +++++++
 system/runstate.c    |   6 +
 4 files changed, 340 insertions(+)
 create mode 100644 pve-backup.h

diff --git a/pve-backup.c b/pve-backup.c
index d0593fc581..d3370d6744 100644
--- a/pve-backup.c
+++ b/pve-backup.c
@@ -1,4 +1,5 @@
 #include "proxmox-backup-client.h"
+#include "pve-backup.h"
 #include "vma.h"
 
 #include "qemu/osdep.h"
@@ -585,6 +586,37 @@ static int setup_snapshot_access(PVEBackupDevInfo *di, Error **errp)
     return 0;
 }
 
+static void setup_all_snapshot_access_bh(void *opaque)
+{
+    assert(!qemu_in_coroutine());
+
+    CoCtxData *data = (CoCtxData*)opaque;
+    Error **errp = (Error**)data->data;
+
+    Error *local_err = NULL;
+
+    GList *l =  backup_state.di_list;
+    while (l) {
+        PVEBackupDevInfo *di = (PVEBackupDevInfo *)l->data;
+        l = g_list_next(l);
+
+        bdrv_drained_begin(di->bs);
+
+        if (setup_snapshot_access(di, &local_err) < 0) {
+            cleanup_snapshot_access(di);
+            bdrv_drained_end(di->bs);
+            error_setg(errp, "%s - setting up snapshot access failed: %s", di->device_name,
+                       local_err ? error_get_pretty(local_err) : "unknown error");
+            break;
+        }
+
+        bdrv_drained_end(di->bs);
+    }
+
+    /* return */
+    aio_co_enter(data->ctx, data->co);
+}
+
 /*
  * backup_job_create can *not* be run from a coroutine, so this can't either.
  * The caller is responsible that backup_mutex is held nonetheless.
@@ -722,6 +754,11 @@ static bool fleecing_no_efi_tpm(const char *device_id)
     return strncmp(device_id, "drive-efidisk", 13) && strncmp(device_id, "drive-tpmstate", 14);
 }
 
+static bool fleecing_all(const char *device_id)
+{
+    return true;
+}
+
 /*
  * Returns a list of device infos, which needs to be freed by the caller. In
  * case of an error, errp will be set, but the returned value might still be a
@@ -810,6 +847,242 @@ err:
     return di_list;
 }
 
+BackupAccessInfoList *coroutine_fn qmp_backup_access_setup(
+    const char *target_id,
+    const char *devlist,
+    Error **errp)
+{
+    assert(qemu_in_coroutine());
+
+    qemu_co_mutex_lock(&backup_state.backup_mutex);
+
+    Error *local_err = NULL;
+    GList *di_list = NULL;
+    GList *l;
+
+    if (backup_state.di_list) {
+        error_set(errp, ERROR_CLASS_GENERIC_ERROR,
+                  "previous backup by provider '%s' not finished", backup_state.target_id);
+        qemu_co_mutex_unlock(&backup_state.backup_mutex);
+        return NULL;
+    }
+
+    bdrv_graph_co_rdlock();
+    di_list = get_device_info(devlist, fleecing_all, &local_err);
+    bdrv_graph_co_rdunlock();
+    if (local_err) {
+        error_propagate(errp, local_err);
+        goto err;
+    }
+    assert(di_list);
+
+    size_t total = 0;
+
+    l = di_list;
+    while (l) {
+        PVEBackupDevInfo *di = (PVEBackupDevInfo *)l->data;
+        l = g_list_next(l);
+
+        ssize_t size = bdrv_getlength(di->bs);
+        if (size < 0) {
+            error_setg_errno(errp, -size, "bdrv_getlength failed");
+            goto err;
+        }
+        di->size = size;
+        total += size;
+
+        di->completed_ret = INT_MAX;
+    }
+
+    qemu_mutex_lock(&backup_state.stat.lock);
+    backup_state.stat.reused = 0;
+
+    /* clear previous backup's bitmap_list */
+    if (backup_state.stat.bitmap_list) {
+        GList *bl = backup_state.stat.bitmap_list;
+        while (bl) {
+            g_free(((PBSBitmapInfo *)bl->data)->drive);
+            g_free(bl->data);
+            bl = g_list_next(bl);
+        }
+        g_list_free(backup_state.stat.bitmap_list);
+        backup_state.stat.bitmap_list = NULL;
+    }
+
+    /* initialize global backup_state now */
+
+    if (backup_state.stat.error) {
+        error_free(backup_state.stat.error);
+        backup_state.stat.error = NULL;
+    }
+
+    backup_state.stat.start_time = time(NULL);
+    backup_state.stat.end_time = 0;
+
+    if (backup_state.stat.backup_file) {
+        g_free(backup_state.stat.backup_file);
+    }
+    backup_state.stat.backup_file = NULL;
+
+    if (backup_state.target_id) {
+        g_free(backup_state.target_id);
+    }
+    backup_state.target_id = g_strdup(target_id);
+
+    /*
+     * The stats will never update, because there is no internal backup job. Initialize them anyway
+     * for completeness.
+     */
+    backup_state.stat.total = total;
+    backup_state.stat.dirty = total - backup_state.stat.reused;
+    backup_state.stat.transferred = 0;
+    backup_state.stat.zero_bytes = 0;
+    backup_state.stat.finishing = false;
+    backup_state.stat.starting = false; // there's no associated QEMU job
+
+    qemu_mutex_unlock(&backup_state.stat.lock);
+
+    backup_state.vmaw = NULL;
+    backup_state.pbs = NULL;
+
+    backup_state.di_list = di_list;
+
+    /* Run setup_all_snapshot_access_bh outside of coroutine (in BH) but keep
+    * backup_mutex locked. This is fine, a CoMutex can be held across yield
+    * points, and we'll release it as soon as the BH reschedules us.
+    */
+    CoCtxData waker = {
+        .co = qemu_coroutine_self(),
+        .ctx = qemu_get_current_aio_context(),
+        .data = &local_err,
+    };
+    aio_bh_schedule_oneshot(waker.ctx, setup_all_snapshot_access_bh, &waker);
+    qemu_coroutine_yield();
+
+    if (local_err) {
+        error_propagate(errp, local_err);
+        goto err;
+    }
+
+    qemu_co_mutex_unlock(&backup_state.backup_mutex);
+
+    BackupAccessInfoList *bai_head = NULL, **p_bai_next = &bai_head;
+
+    l = di_list;
+    while (l) {
+        PVEBackupDevInfo *di = (PVEBackupDevInfo *)l->data;
+        l = g_list_next(l);
+
+        BackupAccessInfoList *info = g_malloc0(sizeof(*info));
+        info->value = g_malloc0(sizeof(*info->value));
+        info->value->node_name = g_strdup(bdrv_get_node_name(di->fleecing.snapshot_access));
+        info->value->device = g_strdup(di->device_name);
+        info->value->size = di->size;
+
+        *p_bai_next = info;
+        p_bai_next = &info->next;
+    }
+
+    return bai_head;
+
+err:
+
+    l = di_list;
+    while (l) {
+        PVEBackupDevInfo *di = (PVEBackupDevInfo *)l->data;
+        l = g_list_next(l);
+
+        g_free(di->device_name);
+        di->device_name = NULL;
+
+        g_free(di);
+    }
+    g_list_free(di_list);
+    backup_state.di_list = NULL;
+
+    qemu_co_mutex_unlock(&backup_state.backup_mutex);
+    return NULL;
+}
+
+/*
+ * Caller needs to hold the backup mutex or the BQL.
+ */
+void backup_access_teardown(void)
+{
+    GList *l = backup_state.di_list;
+
+    while (l) {
+        PVEBackupDevInfo *di = (PVEBackupDevInfo *)l->data;
+        l = g_list_next(l);
+
+        if (di->fleecing.snapshot_access) {
+            bdrv_unref(di->fleecing.snapshot_access);
+            di->fleecing.snapshot_access = NULL;
+        }
+        if (di->fleecing.cbw) {
+            bdrv_cbw_drop(di->fleecing.cbw);
+            di->fleecing.cbw = NULL;
+        }
+
+        g_free(di->device_name);
+        di->device_name = NULL;
+
+        g_free(di);
+    }
+    g_list_free(backup_state.di_list);
+    backup_state.di_list = NULL;
+}
+
+// Not done in a coroutine, because bdrv_co_unref() and cbw_drop() would just spawn BHs anyways.
+// Caller needs to hold the backup_state.backup_mutex lock
+static void backup_access_teardown_bh(void *opaque)
+{
+    CoCtxData *data = (CoCtxData*)opaque;
+
+    backup_access_teardown();
+
+    /* return */
+    aio_co_enter(data->ctx, data->co);
+}
+
+void coroutine_fn qmp_backup_access_teardown(const char *target_id, Error **errp)
+{
+    assert(qemu_in_coroutine());
+
+    qemu_co_mutex_lock(&backup_state.backup_mutex);
+
+    if (!backup_state.target_id) { // nothing to do
+        qemu_co_mutex_unlock(&backup_state.backup_mutex);
+        return;
+    }
+
+    /*
+     * Continue with target_id == NULL, used by the callback registered for qemu_cleanup()
+     */
+    if (target_id && strcmp(backup_state.target_id, target_id)) {
+        error_setg(errp, "cannot teardown backup access - got provider %s instead of %s",
+                   target_id, backup_state.target_id);
+        qemu_co_mutex_unlock(&backup_state.backup_mutex);
+        return;
+    }
+
+    if (!strcmp(backup_state.target_id, "Proxmox VE")) {
+        error_setg(errp, "cannot teardown backup access for PVE - use backup-cancel instead");
+        qemu_co_mutex_unlock(&backup_state.backup_mutex);
+        return;
+    }
+
+    CoCtxData waker = {
+        .co = qemu_coroutine_self(),
+        .ctx = qemu_get_current_aio_context(),
+    };
+    aio_bh_schedule_oneshot(waker.ctx, backup_access_teardown_bh, &waker);
+    qemu_coroutine_yield();
+
+    qemu_co_mutex_unlock(&backup_state.backup_mutex);
+    return;
+}
+
 UuidInfo coroutine_fn *qmp_backup(
     const char *backup_file,
     const char *password,
diff --git a/pve-backup.h b/pve-backup.h
new file mode 100644
index 0000000000..4033bc848f
--- /dev/null
+++ b/pve-backup.h
@@ -0,0 +1,16 @@
+/*
+ * Backup code used by Proxmox VE
+ *
+ * Copyright (C) Proxmox Server Solutions
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ *
+ */
+
+#ifndef PVE_BACKUP_H
+#define PVE_BACKUP_H
+
+void backup_access_teardown(void);
+
+#endif /* PVE_BACKUP_H */
diff --git a/qapi/block-core.json b/qapi/block-core.json
index ff441d4258..68f8da3144 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -1098,6 +1098,51 @@
 ##
 { 'command': 'query-pbs-bitmap-info', 'returns': ['PBSBitmapInfo'] }
 
+##
+# @BackupAccessInfo:
+#
+# Info associated to a snapshot access for backup.  For more information about
+# the bitmap see @BackupAccessBitmapMode.
+#
+# @node-name: the block node name of the snapshot-access node.
+#
+# @device: the device on top of which the snapshot access was created.
+#
+# @size: the size of the block device in bytes.
+#
+##
+{ 'struct': 'BackupAccessInfo',
+  'data': { 'node-name': 'str', 'device': 'str', 'size': 'size' } }
+
+##
+# @backup-access-setup:
+#
+# Set up snapshot access to VM drives for external backup provider.  No other
+# backup or backup access can be done before tearing down the backup access.
+#
+# @target-id: the ID of the external backup provider.
+#
+# @devlist: list of block device names (separated by ',', ';' or ':'). By
+#     default the backup includes all writable block devices.
+#
+# Returns: a list of @BackupAccessInfo, one for each device.
+#
+##
+{ 'command': 'backup-access-setup',
+  'data': { 'target-id': 'str', '*devlist': 'str' },
+  'returns': [ 'BackupAccessInfo' ], 'coroutine': true }
+
+##
+# @backup-access-teardown:
+#
+# Tear down a previously set up snapshot access for the same provider.
+#
+# @target-id: the ID of the external backup provider.
+#
+##
+{ 'command': 'backup-access-teardown', 'data': { 'target-id': 'str' },
+  'coroutine': true }
+
 ##
 # @BlockDeviceTimedStats:
 #
diff --git a/system/runstate.c b/system/runstate.c
index d6ab860eca..7e641e4484 100644
--- a/system/runstate.c
+++ b/system/runstate.c
@@ -60,6 +60,7 @@
 #include "sysemu/sysemu.h"
 #include "sysemu/tpm.h"
 #include "trace.h"
+#include "pve-backup.h"
 
 static NotifierList exit_notifiers =
     NOTIFIER_LIST_INITIALIZER(exit_notifiers);
@@ -868,6 +869,11 @@ void qemu_cleanup(int status)
      * requests happening from here on anyway.
      */
     bdrv_drain_all_begin();
+    /*
+     * The backup access is set up by a QMP command, but is neither owned by a monitor nor
+     * associated to a BlockBackend. Need to tear it down manually here.
+     */
+    backup_access_teardown();
     job_cancel_sync_all();
     bdrv_close_all();
 
-- 
2.39.5
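
To illustrate the intended call flow, here is a minimal sketch of how a
management layer might drive these commands from Perl. The mon_cmd()
helper, target ID and drive name are assumptions for illustration, not
part of the patch:

    my $access_info = mon_cmd(
        $vmid, 'backup-access-setup',
        'target-id' => 'my-provider',
        'devlist' => 'drive-scsi0',
    );

    for my $info ($access_info->@*) {
        # node-name is the snapshot-access block node that can be exported
        # (e.g. via NBD), device the drive it sits on top of, size the size
        # of the block device in bytes
        print "$info->{device}: node $info->{'node-name'}, $info->{size} bytes\n";
    }

    # Always tear the access down again, even if the provider backup failed.
    mon_cmd($vmid, 'backup-access-teardown', 'target-id' => 'my-provider');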



_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


^ permalink raw reply	[flat|nested] 63+ messages in thread

* [pve-devel] [RFC qemu v3 09/34] PVE backup: implement bitmap support for external backup access
  2024-11-07 16:51 [pve-devel] [RFC qemu/common/storage/qemu-server/container/manager v3 00/34] backup provider API Fiona Ebner
                   ` (7 preceding siblings ...)
  2024-11-07 16:51 ` [pve-devel] [RFC qemu v3 08/34] PVE backup: implement backup access setup and teardown API for external providers Fiona Ebner
@ 2024-11-07 16:51 ` Fiona Ebner
  2024-11-07 16:51 ` [pve-devel] [RFC common v3 10/34] env: add module with helpers to run a Perl subroutine in a user namespace Fiona Ebner
                   ` (25 subsequent siblings)
  34 siblings, 0 replies; 63+ messages in thread
From: Fiona Ebner @ 2024-11-07 16:51 UTC (permalink / raw)
  To: pve-devel

There can be one dirty bitmap for each backup target ID (tracked in
the backup_access_bitmaps hash table). The QMP user can specify the ID
of the bitmap it would like to use. This ID is then compared to the
current one for the given target. If they match, the bitmap is re-used
(or re-created, should it no longer exist on the drive). If there is a
mismatch, the old bitmap is removed and a new one is created.

The return value of the QMP command includes information about what
bitmap action was taken, similar to what the query-backup QMP command
returns for regular backups. It also includes the bitmap name and
associated block node, so the management layer can then set up an NBD
export with the bitmap.

While the backup access is active, a background bitmap is also
required. This is necessary to implement bitmap handling according to
the original reference [0]. In particular:

- in the error case, new writes since the backup access was set up are
  in the background bitmap. Because of failure, the previously tracked
  writes from the backup access bitmap are still required too. Thus,
  the bitmap is merged with the background bitmap to get all new
  writes since the last backup.

- in the success case, continue tracking for the next incremental
  backup in the backup access bitmap. New writes since the backup
  access was set up are in the background bitmap. Because the backup
  was successful, clear the backup access bitmap and merge back the
  background bitmap to get only the new writes.

Since QEMU cannot know if the backup was successful or not (except if
failure already happens during the setup QMP command), the management
layer needs to tell it via the teardown QMP command.

The bitmap action is also recorded in the device info now.

[0]: https://lore.kernel.org/qemu-devel/b68833dd-8864-4d72-7c61-c134a9835036@ya.ru/
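
For illustration, a sketch of how a management layer might use the
extended commands (mon_cmd() and run_provider_backup() are assumed
helper names, not part of this patch):

    my $access_info = mon_cmd(
        $vmid, 'backup-access-setup',
        'target-id' => 'my-provider',
        'bitmap-name' => 'my-provider-bitmap', # same name for incremental
    );

    my $ok = eval {
        for my $info ($access_info->@*) {
            # bitmap-action tells whether the bitmap was re-used, newly
            # created, etc. - only transfer the dirty parts when re-used
            run_provider_backup($info);
        }
        1;
    };

    # QEMU merges or clears the bitmaps depending on the reported success.
    mon_cmd(
        $vmid, 'backup-access-teardown',
        'target-id' => 'my-provider',
        'success' => $ok ? JSON::true : JSON::false,
    );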

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---

No changes in v3.

 pve-backup.c         | 175 ++++++++++++++++++++++++++++++++++++++++++-
 pve-backup.h         |   2 +-
 qapi/block-core.json |  22 +++++-
 system/runstate.c    |   2 +-
 4 files changed, 193 insertions(+), 8 deletions(-)

diff --git a/pve-backup.c b/pve-backup.c
index d3370d6744..5f8dd396d5 100644
--- a/pve-backup.c
+++ b/pve-backup.c
@@ -15,6 +15,7 @@
 #include "qapi/qmp/qdict.h"
 #include "qapi/qmp/qerror.h"
 #include "qemu/cutils.h"
+#include "qemu/error-report.h"
 
 #if defined(CONFIG_MALLOC_TRIM)
 #include <malloc.h>
@@ -41,6 +42,7 @@
  */
 
 const char *PBS_BITMAP_NAME = "pbs-incremental-dirty-bitmap";
+const char *BACKGROUND_BITMAP_NAME = "backup-access-background-bitmap";
 
 static struct PVEBackupState {
     struct {
@@ -72,6 +74,7 @@ static struct PVEBackupState {
     CoMutex backup_mutex;
     CoMutex dump_callback_mutex;
     char *target_id;
+    GHashTable *backup_access_bitmaps; // key=target_id, value=bitmap_name
 } backup_state;
 
 static void pvebackup_init(void)
@@ -99,6 +102,8 @@ typedef struct PVEBackupDevInfo {
     char* device_name;
     int completed_ret; // INT_MAX if not completed
     BdrvDirtyBitmap *bitmap;
+    BdrvDirtyBitmap *background_bitmap; // used for external backup access
+    PBSBitmapAction bitmap_action;
     BlockDriverState *target;
     BlockJob *job;
 } PVEBackupDevInfo;
@@ -362,6 +367,67 @@ static void coroutine_fn pvebackup_co_complete_stream(void *opaque)
     qemu_co_mutex_unlock(&backup_state.backup_mutex);
 }
 
+/*
+ * New writes since the backup access was set up are in the background bitmap. Because of failure,
+ * the previously tracked writes in di->bitmap are still required too. Thus, merge with the
+ * background bitmap to get all new writes since the last backup.
+ */
+static void handle_backup_access_bitmaps_in_error_case(PVEBackupDevInfo *di)
+{
+    Error *local_err = NULL;
+
+    if (di->bs && di->background_bitmap) {
+        bdrv_drained_begin(di->bs);
+        if (di->bitmap) {
+            bdrv_enable_dirty_bitmap(di->bitmap);
+            if (!bdrv_merge_dirty_bitmap(di->bitmap, di->background_bitmap, NULL, &local_err)) {
+                warn_report("backup access: %s - could not merge bitmaps in error path - %s",
+                            di->device_name,
+                            local_err ? error_get_pretty(local_err) : "unknown error");
+                /*
+                 * Could not merge, drop original bitmap too.
+                 */
+                bdrv_release_dirty_bitmap(di->bitmap);
+            }
+        } else {
+            warn_report("backup access: %s - expected bitmap not present", di->device_name);
+        }
+        bdrv_release_dirty_bitmap(di->background_bitmap);
+        bdrv_drained_end(di->bs);
+    }
+}
+
+/*
+ * Continue tracking for next incremental backup in di->bitmap. New writes since the backup access
+ * was set up are in the background bitmap. Because the backup was successful, clear di->bitmap and
+ * merge back the background bitmap to get only the new writes.
+ */
+static void handle_backup_access_bitmaps_after_success(PVEBackupDevInfo *di)
+{
+    Error *local_err = NULL;
+
+    if (di->bs && di->background_bitmap) {
+        bdrv_drained_begin(di->bs);
+        if (di->bitmap) {
+            bdrv_enable_dirty_bitmap(di->bitmap);
+            bdrv_clear_dirty_bitmap(di->bitmap, NULL);
+            if (!bdrv_merge_dirty_bitmap(di->bitmap, di->background_bitmap, NULL, &local_err)) {
+                warn_report("backup access: %s - could not merge bitmaps after backup - %s",
+                            di->device_name,
+                            local_err ? error_get_pretty(local_err) : "unknown error");
+                /*
+                 * Could not merge, drop original bitmap too.
+                 */
+                bdrv_release_dirty_bitmap(di->bitmap);
+            }
+        } else {
+            warn_report("backup access: %s - expected bitmap not present", di->device_name);
+        }
+        bdrv_release_dirty_bitmap(di->background_bitmap);
+        bdrv_drained_end(di->bs);
+    }
+}
+
 static void cleanup_snapshot_access(PVEBackupDevInfo *di)
 {
     if (di->fleecing.snapshot_access) {
@@ -602,6 +668,21 @@ static void setup_all_snapshot_access_bh(void *opaque)
 
         bdrv_drained_begin(di->bs);
 
+        if (di->bitmap) {
+            BdrvDirtyBitmap *background_bitmap =
+                bdrv_create_dirty_bitmap(di->bs, PROXMOX_BACKUP_DEFAULT_CHUNK_SIZE,
+                                         BACKGROUND_BITMAP_NAME, &local_err);
+            if (!background_bitmap) {
+                error_setg(errp, "%s - creating background bitmap for backup access failed: %s",
+                           di->device_name,
+                           local_err ? error_get_pretty(local_err) : "unknown error");
+                bdrv_drained_end(di->bs);
+                break;
+            }
+            di->background_bitmap = background_bitmap;
+            bdrv_disable_dirty_bitmap(di->bitmap);
+        }
+
         if (setup_snapshot_access(di, &local_err) < 0) {
             cleanup_snapshot_access(di);
             bdrv_drained_end(di->bs);
@@ -850,6 +931,7 @@ err:
 BackupAccessInfoList *coroutine_fn qmp_backup_access_setup(
     const char *target_id,
     const char *devlist,
+    const char *bitmap_name,
     Error **errp)
 {
     assert(qemu_in_coroutine());
@@ -909,6 +991,77 @@ BackupAccessInfoList *coroutine_fn qmp_backup_access_setup(
         backup_state.stat.bitmap_list = NULL;
     }
 
+    if (!backup_state.backup_access_bitmaps) {
+        backup_state.backup_access_bitmaps =
+            g_hash_table_new_full(g_str_hash, g_str_equal, free, free);
+    }
+
+    /* create bitmaps if requested */
+    l = di_list;
+    while (l) {
+        PVEBackupDevInfo *di = (PVEBackupDevInfo *)l->data;
+        l = g_list_next(l);
+
+        di->block_size = PROXMOX_BACKUP_DEFAULT_CHUNK_SIZE;
+
+        PBSBitmapAction action = PBS_BITMAP_ACTION_NOT_USED;
+        size_t dirty = di->size;
+
+        const char *old_bitmap_name =
+            (const char*)g_hash_table_lookup(backup_state.backup_access_bitmaps, target_id);
+
+        bool same_bitmap_name =
+            old_bitmap_name && bitmap_name && strcmp(bitmap_name, old_bitmap_name) == 0;
+
+        if (old_bitmap_name && !same_bitmap_name) {
+            BdrvDirtyBitmap *old_bitmap = bdrv_find_dirty_bitmap(di->bs, old_bitmap_name);
+            if (!old_bitmap) {
+                warn_report("setup backup access: expected old bitmap '%s' not found for drive "
+                            "'%s'", old_bitmap_name, di->device_name);
+            } else {
+                g_hash_table_remove(backup_state.backup_access_bitmaps, target_id);
+                bdrv_release_dirty_bitmap(old_bitmap);
+                action = PBS_BITMAP_ACTION_NOT_USED_REMOVED;
+            }
+        }
+
+        BdrvDirtyBitmap *bitmap = NULL;
+        if (bitmap_name) {
+            bitmap = bdrv_find_dirty_bitmap(di->bs, bitmap_name);
+            if (!bitmap) {
+                bitmap = bdrv_create_dirty_bitmap(di->bs, PROXMOX_BACKUP_DEFAULT_CHUNK_SIZE,
+                                                  bitmap_name, errp);
+                if (!bitmap) {
+                    qemu_mutex_unlock(&backup_state.stat.lock);
+                    goto err;
+                }
+                bdrv_set_dirty_bitmap(bitmap, 0, di->size);
+                action = same_bitmap_name ? PBS_BITMAP_ACTION_INVALID : PBS_BITMAP_ACTION_NEW;
+            } else {
+                /* track clean chunks as reused */
+                dirty = MIN(bdrv_get_dirty_count(bitmap), di->size);
+                backup_state.stat.reused += di->size - dirty;
+                action = PBS_BITMAP_ACTION_USED;
+            }
+
+            if (!same_bitmap_name) {
+                g_hash_table_insert(backup_state.backup_access_bitmaps,
+                                    strdup(target_id), strdup(bitmap_name));
+            }
+
+        }
+
+        PBSBitmapInfo *info = g_malloc(sizeof(*info));
+        info->drive = g_strdup(di->device_name);
+        info->action = action;
+        info->size = di->size;
+        info->dirty = dirty;
+        backup_state.stat.bitmap_list = g_list_append(backup_state.stat.bitmap_list, info);
+
+        di->bitmap = bitmap;
+        di->bitmap_action = action;
+    }
+
     /* initialize global backup_state now */
 
     if (backup_state.stat.error) {
@@ -978,6 +1131,12 @@ BackupAccessInfoList *coroutine_fn qmp_backup_access_setup(
         info->value->node_name = g_strdup(bdrv_get_node_name(di->fleecing.snapshot_access));
         info->value->device = g_strdup(di->device_name);
         info->value->size = di->size;
+        if (bitmap_name) {
+            info->value->bitmap_node_name = g_strdup(bdrv_get_node_name(di->bs));
+            info->value->bitmap_name = g_strdup(bitmap_name);
+            info->value->bitmap_action = di->bitmap_action;
+            info->value->has_bitmap_action = true;
+        }
 
         *p_bai_next = info;
         p_bai_next = &info->next;
@@ -992,6 +1151,8 @@ err:
         PVEBackupDevInfo *di = (PVEBackupDevInfo *)l->data;
         l = g_list_next(l);
 
+        handle_backup_access_bitmaps_in_error_case(di);
+
         g_free(di->device_name);
         di->device_name = NULL;
 
@@ -1007,7 +1168,7 @@ err:
 /*
  * Caller needs to hold the backup mutex or the BQL.
  */
-void backup_access_teardown(void)
+void backup_access_teardown(bool success)
 {
     GList *l = backup_state.di_list;
 
@@ -1024,6 +1185,12 @@ void backup_access_teardown(void)
             di->fleecing.cbw = NULL;
         }
 
+        if (success) {
+            handle_backup_access_bitmaps_after_success(di);
+        } else {
+            handle_backup_access_bitmaps_in_error_case(di);
+        }
+
         g_free(di->device_name);
         di->device_name = NULL;
 
@@ -1039,13 +1206,13 @@ static void backup_access_teardown_bh(void *opaque)
 {
     CoCtxData *data = (CoCtxData*)opaque;
 
-    backup_access_teardown();
+    backup_access_teardown(*((bool*)data->data));
 
     /* return */
     aio_co_enter(data->ctx, data->co);
 }
 
-void coroutine_fn qmp_backup_access_teardown(const char *target_id, Error **errp)
+void coroutine_fn qmp_backup_access_teardown(const char *target_id, bool success, Error **errp)
 {
     assert(qemu_in_coroutine());
 
@@ -1075,6 +1242,7 @@ void coroutine_fn qmp_backup_access_teardown(const char *target_id, Error **errp
     CoCtxData waker = {
         .co = qemu_coroutine_self(),
         .ctx = qemu_get_current_aio_context(),
+        .data = &success,
     };
     aio_bh_schedule_oneshot(waker.ctx, backup_access_teardown_bh, &waker);
     qemu_coroutine_yield();
@@ -1284,6 +1452,7 @@ UuidInfo coroutine_fn *qmp_backup(
             }
 
             di->dev_id = dev_id;
+            di->bitmap_action = action;
 
             PBSBitmapInfo *info = g_malloc(sizeof(*info));
             info->drive = g_strdup(di->device_name);
diff --git a/pve-backup.h b/pve-backup.h
index 4033bc848f..9ebeef7c8f 100644
--- a/pve-backup.h
+++ b/pve-backup.h
@@ -11,6 +11,6 @@
 #ifndef PVE_BACKUP_H
 #define PVE_BACKUP_H
 
-void backup_access_teardown(void);
+void backup_access_teardown(bool success);
 
 #endif /* PVE_BACKUP_H */
diff --git a/qapi/block-core.json b/qapi/block-core.json
index 68f8da3144..2de777c86b 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -1110,9 +1110,17 @@
 #
 # @size: the size of the block device in bytes.
 #
+# @bitmap-node-name: the block node name the dirty bitmap is associated to.
+#
+# @bitmap-name: the name of the dirty bitmap associated to the backup access.
+#
+# @bitmap-action: the action taken on the dirty bitmap.
+#
 ##
 { 'struct': 'BackupAccessInfo',
-  'data': { 'node-name': 'str', 'device': 'str', 'size': 'size' } }
+  'data': { 'node-name': 'str', 'device': 'str', 'size': 'size',
+            '*bitmap-node-name': 'str', '*bitmap-name': 'str',
+            '*bitmap-action': 'PBSBitmapAction' } }
 
 ##
 # @backup-access-setup:
@@ -1125,11 +1133,16 @@
 # @devlist: list of block device names (separated by ',', ';' or ':'). By
 #     default the backup includes all writable block devices.
 #
+# @bitmap-name: use/create a bitmap with this name. Re-using the same name
+#     allows for making incremental backups. Check the @bitmap-action in the
+#     result to see if you can actually re-use the bitmap or if it had to be
+#     newly created.
+#
 # Returns: a list of @BackupAccessInfo, one for each device.
 #
 ##
 { 'command': 'backup-access-setup',
-  'data': { 'target-id': 'str', '*devlist': 'str' },
+  'data': { 'target-id': 'str', '*devlist': 'str', '*bitmap-name': 'str' },
   'returns': [ 'BackupAccessInfo' ], 'coroutine': true }
 
 ##
@@ -1139,8 +1152,11 @@
 #
 # @target-id: the ID of the external backup provider.
 #
+# @success: whether the backup done by the external provider was successful.
+#
 ##
-{ 'command': 'backup-access-teardown', 'data': { 'target-id': 'str' },
+{ 'command': 'backup-access-teardown',
+  'data': { 'target-id': 'str', 'success': 'bool' },
   'coroutine': true }
 
 ##
diff --git a/system/runstate.c b/system/runstate.c
index 7e641e4484..b61996dd7a 100644
--- a/system/runstate.c
+++ b/system/runstate.c
@@ -873,7 +873,7 @@ void qemu_cleanup(int status)
      * The backup access is set up by a QMP command, but is neither owned by a monitor nor
      * associated to a BlockBackend. Need to tear it down manually here.
      */
-    backup_access_teardown();
+    backup_access_teardown(false);
     job_cancel_sync_all();
     bdrv_close_all();
 
-- 
2.39.5



_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


^ permalink raw reply	[flat|nested] 63+ messages in thread

* [pve-devel] [RFC common v3 10/34] env: add module with helpers to run a Perl subroutine in a user namespace
  2024-11-07 16:51 [pve-devel] [RFC qemu/common/storage/qemu-server/container/manager v3 00/34] backup provider API Fiona Ebner
                   ` (8 preceding siblings ...)
  2024-11-07 16:51 ` [pve-devel] [RFC qemu v3 09/34] PVE backup: implement bitmap support for external backup access Fiona Ebner
@ 2024-11-07 16:51 ` Fiona Ebner
  2024-11-11 18:33   ` Thomas Lamprecht
  2024-11-12 14:20   ` Fabian Grünbichler
  2024-11-07 16:51 ` [pve-devel] [RFC storage v3 11/34] add storage_has_feature() helper function Fiona Ebner
                   ` (24 subsequent siblings)
  34 siblings, 2 replies; 63+ messages in thread
From: Fiona Ebner @ 2024-11-07 16:51 UTC (permalink / raw)
  To: pve-devel

The first use case is running the container backup subroutine for
external providers inside a user namespace. That allows them to see
the filesystem to back up from the container's perspective and also
improves security because of the isolation.

Copied and adapted the relevant parts from the pve-buildpkg
repository.
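
For illustration, usage could look like the following sketch; the
id-map values are examples, with the format matching what the module's
set_id_map() consumes:

    use PVE::Env;

    # map container uid/gid 0..65535 to host uid/gid 100000..165535
    my $id_map = [
        ['u', 0, 100000, 65536],
        ['g', 0, 100000, 65536],
    ];

    PVE::Env::run_in_userns(sub {
        # runs as (mapped) root inside the new user and mount namespace
        system('id');
    }, $id_map);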

Originally-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
[FE: add $idmap parameter, drop $aux_groups parameter]
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---

New in v3.

 src/Makefile   |   1 +
 src/PVE/Env.pm | 136 +++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 137 insertions(+)
 create mode 100644 src/PVE/Env.pm

diff --git a/src/Makefile b/src/Makefile
index 2d8bdc4..dba26e3 100644
--- a/src/Makefile
+++ b/src/Makefile
@@ -15,6 +15,7 @@ LIB_SOURCES = \
 	Certificate.pm \
 	CpuSet.pm \
 	Daemon.pm \
+	Env.pm \
 	Exception.pm \
 	Format.pm \
 	INotify.pm \
diff --git a/src/PVE/Env.pm b/src/PVE/Env.pm
new file mode 100644
index 0000000..e11bec0
--- /dev/null
+++ b/src/PVE/Env.pm
@@ -0,0 +1,136 @@
+package PVE::Env;
+
+use strict;
+use warnings;
+
+use Fcntl qw(O_WRONLY);
+use POSIX qw(EINTR);
+use Socket;
+
+require qw(syscall.ph);
+
+use constant {CLONE_NEWNS   => 0x00020000,
+              CLONE_NEWUSER => 0x10000000};
+
+sub unshare($) {
+    my ($flags) = @_;
+    return 0 == syscall(272, $flags);
+}
+
+sub __set_id_map($$$) {
+    my ($pid, $what, $value) = @_;
+    sysopen(my $fd, "/proc/$pid/${what}_map", O_WRONLY)
+	or die "failed to open child process' ${what}_map\n";
+    my $rc = syswrite($fd, $value);
+    if (!$rc || $rc != length($value)) {
+	die "failed to set sub$what: $!\n";
+    }
+    close($fd);
+}
+
+sub set_id_map($$) {
+    my ($pid, $id_map) = @_;
+
+    my $gid_map = '';
+    my $uid_map = '';
+
+    for my $map ($id_map->@*) {
+	my ($type, $ct, $host, $length) = $map->@*;
+
+	$gid_map .= "$ct $host $length\n" if $type eq 'g';
+	$uid_map .= "$ct $host $length\n" if $type eq 'u';
+    }
+
+    __set_id_map($pid, 'gid', $gid_map) if $gid_map;
+    __set_id_map($pid, 'uid', $uid_map) if $uid_map;
+}
+
+sub wait_for_child($;$) {
+    my ($pid, $noerr) = @_;
+    my $interrupts = 0;
+    while (waitpid($pid, 0) != $pid) {
+	if ($! == EINTR) {
+	    warn "interrupted...\n";
+	    kill(($interrupts > 3 ? 9 : 15), $pid);
+	    $interrupts++;
+	}
+    }
+    my $status = POSIX::WEXITSTATUS($?);
+    return $status if $noerr;
+
+    if ($? == -1) {
+	die "failed to execute\n";
+    } elsif (POSIX::WIFSIGNALED($?)) {
+	my $sig = POSIX::WTERMSIG($?);
+	die "got signal $sig\n";
+    } elsif ($status != 0) {
+	warn "exit code $status\n";
+    }
+    return $status;
+}
+
+sub forked(&%) {
+    my ($code, %opts) = @_;
+
+    pipe(my $except_r, my $except_w) or die "pipe: $!\n";
+
+    my $pid = fork();
+    die "fork failed: $!\n" if !defined($pid);
+
+    if ($pid == 0) {
+	close($except_r);
+	eval { $code->() };
+	if ($@) {
+	    print {$except_w} $@;
+	    $except_w->flush();
+	    POSIX::_exit(1);
+	}
+	POSIX::_exit(0);
+    }
+    close($except_w);
+
+    my $err;
+    if (my $afterfork = $opts{afterfork}) {
+	eval { $afterfork->($pid); };
+	if ($err = $@) {
+	    kill(15, $pid);
+	    $opts{noerr} = 1;
+	}
+    }
+    if (!$err) {
+	$err = do { local $/ = undef; <$except_r> };
+    }
+    my $rv = wait_for_child($pid, $opts{noerr});
+    die $err if $err;
+    die "an unknown error occurred\n" if $rv != 0;
+    return $rv;
+}
+
+sub run_in_userns(&;$) {
+    my ($code, $id_map) = @_;
+    socketpair(my $sp, my $sc, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
+	or die "socketpair: $!\n";
+    forked(sub {
+	close($sp);
+	unshare(CLONE_NEWUSER|CLONE_NEWNS) or die "unshare(NEWUSER|NEWNS): $!\n";
+	syswrite($sc, "1\n") == 2 or die "write: $!\n";
+	shutdown($sc, 1);
+	my $two = <$sc>;
+	die "failed to sync with parent process\n" if $two ne "2\n";
+	close($sc);
+	$! = undef;
+	($(, $)) = (0, 0); die "$!\n" if $!;
+	($<, $>) = (0, 0); die "$!\n" if $!;
+	$code->();
+    }, afterfork => sub {
+	my ($pid) = @_;
+	close($sc);
+	my $one = <$sp>;
+	die "failed to sync with userprocess\n" if $one ne "1\n";
+	set_id_map($pid, $id_map);
+	syswrite($sp, "2\n") == 2 or die "write: $!\n";
+	close($sp);
+    });
+}
+
+1;
-- 
2.39.5



_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


^ permalink raw reply	[flat|nested] 63+ messages in thread

* [pve-devel] [RFC storage v3 11/34] add storage_has_feature() helper function
  2024-11-07 16:51 [pve-devel] [RFC qemu/common/storage/qemu-server/container/manager v3 00/34] backup provider API Fiona Ebner
                   ` (9 preceding siblings ...)
  2024-11-07 16:51 ` [pve-devel] [RFC common v3 10/34] env: add module with helpers to run a Perl subroutine in a user namespace Fiona Ebner
@ 2024-11-07 16:51 ` Fiona Ebner
  2024-11-07 16:51 ` [pve-devel] [RFC storage v3 12/34] plugin: introduce new_backup_provider() method Fiona Ebner
                   ` (23 subsequent siblings)
  34 siblings, 0 replies; 63+ messages in thread
From: Fiona Ebner @ 2024-11-07 16:51 UTC (permalink / raw)
  To: pve-devel

The helper looks up whether a storage supports a given feature in its
'plugindata'. This is intentionally kept simple and not implemented
as a plugin method for now. Should it ever become more complex,
requiring plugins to override the default implementation, it can
later be changed to a method.
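
For illustration, a storage plugin could declare and query the feature
like the following sketch; the plugin module and content hash are
examples, and the 'backup-provider' feature itself is only introduced
later in this series:

    # in the plugin, e.g. PVE/Storage/Custom/ExamplePlugin.pm
    sub plugindata {
        return {
            content => [ { backup => 1 }, { backup => 1 } ],
            features => { 'backup-provider' => 1 },
        };
    }

    # caller side
    if (PVE::Storage::storage_has_feature($cfg, $storeid, 'backup-provider')) {
        # the storage is backed by an external backup provider
    }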

Suggested-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---

New in v3.

 src/PVE/Storage.pm        |  8 ++++++++
 src/PVE/Storage/Plugin.pm | 10 ++++++++++
 2 files changed, 18 insertions(+)

diff --git a/src/PVE/Storage.pm b/src/PVE/Storage.pm
index 57b2038..e251056 100755
--- a/src/PVE/Storage.pm
+++ b/src/PVE/Storage.pm
@@ -204,6 +204,14 @@ sub storage_check_enabled {
     return storage_check_node($cfg, $storeid, $node, $noerr);
 }
 
+sub storage_has_feature {
+    my ($cfg, $storeid, $feature) = @_;
+
+    my $scfg = storage_config($cfg, $storeid);
+
+    return PVE::Storage::Plugin::storage_has_feature($scfg->{type}, $feature);
+}
+
 # storage_can_replicate:
 # return true if storage supports replication
 # (volumes allocated with vdisk_alloc() has replication feature)
diff --git a/src/PVE/Storage/Plugin.pm b/src/PVE/Storage/Plugin.pm
index 8cc693c..6071e45 100644
--- a/src/PVE/Storage/Plugin.pm
+++ b/src/PVE/Storage/Plugin.pm
@@ -244,6 +244,16 @@ sub dirs_hash_to_string {
     return join(',', map { "$_=$hash->{$_}" } sort keys %$hash);
 }
 
+sub storage_has_feature {
+    my ($type, $feature) = @_;
+
+    my $data = $defaultData->{plugindata}->{$type};
+    if (my $features = $data->{features}) {
+	return $features->{$feature};
+    }
+    return;
+}
+
 sub default_format {
     my ($scfg) = @_;
 
-- 
2.39.5



_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel

^ permalink raw reply	[flat|nested] 63+ messages in thread

* [pve-devel] [RFC storage v3 12/34] plugin: introduce new_backup_provider() method
  2024-11-07 16:51 [pve-devel] [RFC qemu/common/storage/qemu-server/container/manager v3 00/34] backup provider API Fiona Ebner
                   ` (10 preceding siblings ...)
  2024-11-07 16:51 ` [pve-devel] [RFC storage v3 11/34] add storage_has_feature() helper function Fiona Ebner
@ 2024-11-07 16:51 ` Fiona Ebner
  2024-11-07 16:51 ` [pve-devel] [RFC storage v3 13/34] extract backup config: delegate to backup provider for storages that support it Fiona Ebner
                   ` (22 subsequent siblings)
  34 siblings, 0 replies; 63+ messages in thread
From: Fiona Ebner @ 2024-11-07 16:51 UTC (permalink / raw)
  To: pve-devel

The new_backup_provider() method can be used by storage plugins for
external backup providers. If the method returns a provider, Proxmox
VE will use callbacks to that provider for backups and restore instead
of using its usual backup/restore mechanisms.

API age and version are both bumped.

The backup provider API is split into two parts, both of which again
need different implementations for VM and LXC guests:

1. Backup API

There are two hook callback functions, namely:
1. job_hook() is called during the start/end/abort phases of the whole
   backup job.
2. backup_hook() is called during the start/end/abort phases of the
   backup of an individual guest. There also is a 'prepare' phase
   useful for container backups, because the backup method for
   containers itself is executed in the user namespace context
   associated to the container.

The backup_get_mechanism() method is used to decide on the backup
mechanism. Currently, 'block-device' or 'nbd' for VMs and 'directory'
for containers are possible. The method also lets the plugin indicate
whether to use a bitmap for incremental VM backup or not. It is enough
to implement one mechanism for VMs and one mechanism for containers.

Next, there are methods for backing up the guest's configuration and
data: backup_vm() for VM backup and backup_container() for container
backup, with the latter running in the user namespace associated to
the container.

Finally, there are some helpers, e.g. for getting the provider name or
the volume ID for the backup target, as well as for handling the
backup log.

1.1 Backup Mechanisms

VM:

Access to the data on the VM's disk from the time the backup started
is made available via a so-called "snapshot access". This is either
the full image, or in case a bitmap is used, the dirty parts of the
image since the last time the bitmap was used for a successful backup.
Reading outside of the dirty parts will result in an error. After
backing up each part of the disk, it should be discarded in the export
to avoid unnecessary space usage on the Proxmox VE side (there is an
associated fleecing image).

VM mechanism 'block-device':

The snapshot access is exposed as a block device. If used, a bitmap is
passed along.

VM mechanism 'nbd':

The snapshot access and, if used, bitmap are exported via NBD.

Container mechanism 'directory':

A copy or snapshot of the container's filesystem state is made
available as a directory. The method is executed inside the user
namespace associated to the container.
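
A container backup with the 'directory' mechanism might then boil down
to archiving the provided sources, e.g. (tar invocation and target path
are illustrative only, exclude pattern handling is omitted):

    sub backup_container {
        my ($self, $vmid, $guest_config, $exclude_patterns, $info) = @_;

        # runs inside the user namespace associated to the container
        my $target = "/mnt/example-backup/ct-$vmid.tar.gz"; # example path
        my @sources = map { "./$_" } $info->{sources}->@*;

        system('tar', 'czf', $target, '-C', $info->{directory}, @sources) == 0
            or die "tar failed - exit code $?\n";
    }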

2. Restore API

The restore_get_mechanism() method is used to decide on the restore
mechanism. Currently, 'qemu-img' for VMs, and 'directory' or 'tar' for
containers are possible. It is enough to implement one mechanism for
VMs and one mechanism for containers.

Next, there are methods for extracting the guest and firewall
configuration, and the restore mechanism is implemented via a pair of
methods: an init method for making the data available to Proxmox VE,
and a cleanup method that is called after the restore.

For VMs, there also is a restore_vm_get_device_info() helper required,
to get the disks included in the backup and their sizes.

2.1. Restore Mechanisms

VM mechanism 'qemu-img':

The backup provider gives a path to the disk image that will be
restored. The path needs to be something 'qemu-img' can deal with,
e.g. can also be an NBD URI or similar.

Container mechanism 'directory':

The backup provider gives the path to a directory with the full
filesystem structure of the container.

Container mechanism 'tar':

The backup provider gives the path to a (potentially compressed) tar
archive with the full filesystem structure of the container.

See the PVE::BackupProvider::Plugin module for the full API
documentation.
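
The overall shape of a provider plugin might look like the following
sketch; the module name and returned values are examples, and the full
set of required methods is documented in the base plugin:

    package PVE::BackupProvider::Plugin::Example;

    use strict;
    use warnings;

    use base qw(PVE::BackupProvider::Plugin::Base);

    sub new {
        my ($class, $storage_plugin, $scfg, $storeid, $log_function) = @_;

        my $self = bless {
            scfg => $scfg,
            storeid => $storeid,
            'storage-plugin' => $storage_plugin,
            log => $log_function,
        }, $class;

        return $self;
    }

    sub provider_name { return "Example"; }

    sub backup_get_mechanism {
        my ($self, $vmid, $vmtype) = @_;

        return ('block-device', undef) if $vmtype eq 'qemu'; # no bitmap
        return ('directory', undef) if $vmtype eq 'lxc';
        die "unsupported guest type '$vmtype'\n";
    }

    # ... remaining backup/restore methods as documented in Base.pm ...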

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---

Changes in v3:
* update docs regarding API changes:
  - prepare phase for backup hook
  - pass in configs as data instead of filenames

 src/PVE/BackupProvider/Makefile        |    3 +
 src/PVE/BackupProvider/Plugin/Base.pm  | 1158 ++++++++++++++++++++++++
 src/PVE/BackupProvider/Plugin/Makefile |    5 +
 src/PVE/Makefile                       |    1 +
 src/PVE/Storage.pm                     |   12 +-
 src/PVE/Storage/Plugin.pm              |   15 +
 6 files changed, 1192 insertions(+), 2 deletions(-)
 create mode 100644 src/PVE/BackupProvider/Makefile
 create mode 100644 src/PVE/BackupProvider/Plugin/Base.pm
 create mode 100644 src/PVE/BackupProvider/Plugin/Makefile

diff --git a/src/PVE/BackupProvider/Makefile b/src/PVE/BackupProvider/Makefile
new file mode 100644
index 0000000..f018cef
--- /dev/null
+++ b/src/PVE/BackupProvider/Makefile
@@ -0,0 +1,3 @@
+.PHONY: install
+install:
+	make -C Plugin install
diff --git a/src/PVE/BackupProvider/Plugin/Base.pm b/src/PVE/BackupProvider/Plugin/Base.pm
new file mode 100644
index 0000000..a8d0a88
--- /dev/null
+++ b/src/PVE/BackupProvider/Plugin/Base.pm
@@ -0,0 +1,1158 @@
+package PVE::BackupProvider::Plugin::Base;
+
+use strict;
+use warnings;
+
+=pod
+
+=head1 NAME
+
+PVE::BackupProvider::Plugin::Base - Base Plugin for Backup Provider API
+
+=head1 SYNOPSIS
+
+    use base qw(PVE::BackupProvider::Plugin::Base);
+
+=head1 DESCRIPTION
+
+This module serves as the base for any module implementing the API that Proxmox
+VE uses to interface with external backup providers. The API is used for
+creating and restoring backups. A backup provider also needs to provide a
+storage plugin for integration with the front-end. The API here is used by the
+backup stack in the backend.
+
+1. Backup API
+
+There are two hook callback functions, namely:
+
+=over
+
+=item C<job_hook()>
+
+Called during the start/end/abort phases of the whole backup job.
+
+=item C<backup_hook()>
+
+Called during the start/end/abort phases of the backup of an
+individual guest.
+
+=back
+
+The backup_get_mechanism() method is used to decide on the backup mechanism.
+Currently, 'block-device' or 'nbd' for VMs and 'directory' for containers are
+possible. The method also lets the plugin indicate whether to use a bitmap for
+incremental VM backup or not. It is enough to implement one mechanism for VMs
+and one mechanism for containers.
+
+Next, there are methods for backing up the guest's configuration and data:
+backup_vm() for VM backup and backup_container() for container backup.
+
+Finally, there are some helpers, e.g. for getting the provider name or the
+volume ID for the backup target, as well as for handling the backup log.
+
+1.1 Backup Mechanisms
+
+VM:
+
+Access to the data on the VM's disk from the time the backup started is made
+available via a so-called "snapshot access". This is either the full image, or
+in case a bitmap is used, the dirty parts of the image since the last time the
+bitmap was used for a successful backup. Reading outside of the dirty parts will
+result in an error. After backing up each part of the disk, it should be
+discarded in the export to avoid unnecessary space usage on the Proxmox VE side
+(there is an associated fleecing image).
+
+VM mechanism 'block-device':
+
+The snapshot access is exposed as a block device. If used, a bitmap is passed
+along.
+
+VM mechanism 'nbd':
+
+The snapshot access and, if used, bitmap are exported via NBD.
+
+Container mechanism 'directory':
+
+A copy or snapshot of the container's filesystem state is made available as a
+directory.
+
+2. Restore API
+
+The restore_get_mechanism() method is used to decide on the restore mechanism.
+Currently, 'qemu-img' for VMs, and 'directory' or 'tar' for containers are
+possible. It is enough to implement one mechanism for VMs and one mechanism for
+containers.
+
+Next, there are methods for extracting the guest and firewall configuration,
+and the restore mechanism is implemented via a pair of methods: an init method
+for making the data available to Proxmox VE, and a cleanup method that is
+called after the restore.
+
+For VMs, there also is a restore_vm_get_device_info() helper required, to get
+the disks included in the backup and their sizes.
+
+2.1. Restore Mechanisms
+
+VM mechanism 'qemu-img':
+
+The backup provider gives a path to the disk image that will be restored. The
+path needs to be something 'qemu-img' can deal with, e.g. can also be an NBD URI
+or similar.
+
+Container mechanism 'directory':
+
+The backup provider gives the path to a directory with the full filesystem
+structure of the container.
+
+Container mechanism 'tar':
+
+The backup provider gives the path to a (potentially compressed) tar archive
+with the full filesystem structure of the container.
+
+=head1 METHODS
+
+=cut
+
+# plugin methods
+
+=pod
+
+=over
+
+=item C<new>
+
+The constructor. Returns a blessed instance of the backup provider class.
+
+Parameters:
+
+=over
+
+=item C<$storage_plugin>
+
+The associated storage plugin class.
+
+=item C<$scfg>
+
+The storage configuration of the associated storage.
+
+=item C<$storeid>
+
+The storage ID of the associated storage.
+
+=item C<$log_function>
+
+The function signature is C<$log_function($log_level, $message)>. This log
+function can be used to write to the backup task log in Proxmox VE.
+
+=over
+
+=item C<$log_level>
+
+Either C<info>, C<warn> or C<err> for informational messages, warnings or error
+messages.
+
+=item C<$message>
+
+The message to be printed.
+
+=back
+
+=back
+
+=back
+
+=cut
+sub new {
+    my ($class, $storage_plugin, $scfg, $storeid, $log_function) = @_;
+
+    die "implement me in subclass";
+}
+
+=pod
+
+=over
+
+=item C<provider_name>
+
+Returns the name of the backup provider. It will be printed in some log lines.
+
+=back
+
+=cut
+sub provider_name {
+    my ($self) = @_;
+
+    die "implement me in subclass";
+}
+
+=pod
+
+=over
+
+=item C<job_hook>
+
+The job hook function. Is called during various phases of the backup job.
+Intended for doing preparations and cleanup. In the future, additional phases
+might get added, so it's best to ignore an unknown phase.
+
+Parameters:
+
+=over
+
+=item C<$phase>
+
+The phase during which the function is called.
+
+=over
+
+=item C<start>
+
+When the job starts, before the first backup is made.
+
+=item C<end>
+
+When the job ends, after all backups are finished, even if some backups
+failed.
+
+=item C<abort>
+
+When the job is aborted (e.g. interrupted by a signal or another fundamental failure).
+
+=back
+
+=item C<$info>
+
+A hash reference containing additional parameters depending on the C<$phase>:
+
+=over
+
+=item C<start>
+
+=over
+
+=item C<< $info->{'start-time'} >>
+
+Unix time-stamp of when the job started.
+
+=back
+
+=item C<end>
+
+No additional information.
+
+=item C<abort>
+
+=over
+
+=item C<< $info->{error} >>
+
+The error message indicating the failure.
+
+=back
+
+=back
+
+=back
+
+=back
+
+=cut
+sub job_hook {
+    my ($self, $phase, $info) = @_;
+
+    die "implement me in subclass";
+}
+
+=pod
+
+=over
+
+=item C<backup_hook>
+
+The backup hook function. Is called during various phases of the backup of a
+given guest. Intended for doing preparations and cleanup. In the future,
+additional phases might get added, so it's best to ignore an unknown phase.
+
+Parameters:
+
+=over
+
+=item C<$phase>
+
+The phase during which the function is called.
+
+=over
+
+=item C<start>
+
+Before the backup of the given guest is made.
+
+=item C<prepare>
+
+Right before C<backup_container()> is called. The method C<backup_container()>
+is called as the ID-mapped root user of the container, so as a potentially
+unprivileged user. The hook is still called as a privileged user to allow for
+the necessary preparation.
+
+=item C<end>
+
+After the backup of the given guest finished successfully.
+
+=item C<abort>
+
+After the backup of the given guest encountered an error or was aborted.
+
+=back
+
+=item C<$vmid>
+
+The ID of the guest being backed up.
+
+=item C<$vmtype>
+
+The type of the guest being backed up. Currently, either C<qemu> or C<lxc>.
+Might be C<undef> in phase C<abort> for certain error scenarios.
+
+=item C<$info>
+
+A hash reference containing additional parameters depending on the C<$phase>:
+
+=over
+
+=item C<start>
+
+=over
+
+=item C<< $info->{'start-time'} >>
+
+Unix time-stamp of when the guest backup started.
+
+=back
+
+=item C<prepare>
+
+The same information that's passed along to C<backup_container()>, see the
+description there.
+
+=item C<end>
+
+No additional information.
+
+=item C<abort>
+
+=over
+
+=item C<< $info->{error} >>
+
+The error message indicating the failure.
+
+=back
+
+=back
+
+=back
+
+=back
+
+=cut
+sub backup_hook {
+    my ($self, $phase, $vmid, $vmtype, $info) = @_;
+
+    die "implement me in subclass";
+}
+
+=pod
+
+=over
+
+=item C<backup_get_mechanism>
+
+Tell the caller what mechanism to use for backing up the guest. The backup
+method for the guest, i.e. C<backup_vm> for guest type C<qemu> or
+C<backup_container> for guest type C<lxc>, will later be called with
+mechanism-specific information. See those methods for more information. Returns
+C<($mechanism, $bitmap_id)>:
+
+=over
+
+=item C<$mechanism>
+
+Currently C<nbd> and C<block-device> for guest type C<qemu> and C<directory>
+for guest type C<lxc> are possible. If there is no support for one of the guest
+types, the method should either C<die> or return C<undef>.
+
+=item C<$bitmap_id>
+
+If the backup provider supports backing up with a bitmap, the ID of the bitmap
+to use. Return C<undef> otherwise. Re-use the same ID multiple times for
+incremental backup.
+
+=back
+
+Parameters:
+
+=over
+
+=item C<$vmid>
+
+The ID of the guest being backed up.
+
+=item C<$vmtype>
+
+The type of the guest being backed up. Currently, either C<qemu> or C<lxc>.
+
+=back
+
+=back
+
+=cut
+sub backup_get_mechanism {
+    my ($self, $vmid, $vmtype) = @_;
+
+    die "implement me in subclass";
+}
+
+=pod
+
+=over
+
+=item C<backup_get_archive_name>
+
+The archive name of the backup archive that will be created by the current
+backup. The returned value needs to be the volume name under which the archive
+can later be accessed via the corresponding storage plugin, i.e. C<$archive_name>
+in the volume ID C<"${storeid}:backup/${archive_name}">.
+
+Parameters:
+
+=over
+
+=item C<$vmid>
+
+The ID of the guest being backed up.
+
+=item C<$vmtype>
+
+The type of the guest being backed up. Currently, either C<qemu> or C<lxc>.
+
+=item C<$backup_time>
+
+Unix time-stamp of when the guest backup started.
+
+=back
+
+=back
+
+=cut
+sub backup_get_archive_name {
+    my ($self, $vmid, $vmtype, $backup_time) = @_;
+
+    die "implement me in subclass";
+}
+
+=pod
+
+=over
+
+=item C<backup_get_task_size>
+
+Returns the size of the backup after completion.
+
+Parameters:
+
+=over
+
+=item C<$vmid>
+
+The ID of the guest being backed up.
+
+=back
+
+=back
+
+=cut
+sub backup_get_task_size {
+    my ($self, $vmid) = @_;
+
+    die "implement me in subclass";
+}
+
+=pod
+
+=over
+
+=item C<backup_handle_log_file>
+
+Handle the backup's log file, which contains the task log for the backup. For
+example, a provider might want to upload a copy to the backup server.
+
+Parameters:
+
+=over
+
+=item C<$vmid>
+
+The ID of the guest being backed up.
+
+=item C<$filename>
+
+Path to the file with the backup log.
+
+=back
+
+=back
+
+=cut
+sub backup_handle_log_file {
+    my ($self, $vmid, $filename) = @_;
+
+    die "implement me in subclass";
+}
+
+=pod
+
+=over
+
+=item C<backup_vm>
+
+Used when the guest type is C<qemu>. Back up the virtual machine's configuration
+and volumes that were made available according to the mechanism returned by
+C<backup_get_mechanism>. Returns when done backing up. Ideally, the method
+should log the progress during backup.
+
+Parameters:
+
+=over
+
+=item C<$vmid>
+
+The ID of the guest being backed up.
+
+=item C<$guest_config>
+
+The guest configuration as raw data.
+
+=item C<$volumes>
+
+Hash reference with information about the VM's volumes. Some parameters are
+mechanism-specific.
+
+=over
+
+=item C<< $volumes->{$devicename} >>
+
+Hash reference with information about the VM volume associated to
+the device C<$devicename>. The device name needs to be remembered for restoring.
+The device name is also the name of the NBD export when the C<nbd> mechanism is
+used.
+
+=item C<< $volumes->{$devicename}->{size} >>
+
+Size of the volume in bytes.
+
+=item C<< $volumes->{$devicename}->{'bitmap-mode'} >>
+
+How a bitmap is used for the current volume.
+
+=over
+
+=item C<none>
+
+No bitmap is used.
+
+=item C<new>
+
+A bitmap has been newly created on the volume.
+
+=item C<reuse>
+
+The bitmap with the same ID as requested is being re-used.
+
+=back
+
+=back
+
+Mechanism-specific parameters, depending on the mechanism:
+
+=over
+
+=item C<block-device>
+
+=over
+
+=item C<< $volumes->{$devicename}->{path} >>
+
+Path to the block device with the backup data.
+
+=item C<< $volumes->{$devicename}->{'next-dirty-region'} >>
+
+A function that will return the offset and length of the next dirty region as a
+two-element list. After the last dirty region, it will return C<undef>. If no
+bitmap is used, it will return C<(0, $size)> and then C<undef>. If a bitmap is
+used, these are the dirty regions according to the bitmap.
+
+=back
+
+=item C<nbd>
+
+=over
+
+=item C<< $volumes->{$devicename}->{'nbd-path'} >>
+
+The path to the Unix socket providing the NBD export with the backup data and,
+if a bitmap is used, bitmap data.
+
+=item C<< $volumes->{$devicename}->{'bitmap-name'} >>
+
+The name of the bitmap in case a bitmap is used.
+
+=back
+
+=back
+
+=item C<$info>
+
+A hash reference containing optional parameters.
+
+Optional parameters:
+
+=over
+
+=item C<< $info->{'bandwidth-limit'} >>
+
+The requested bandwidth limit. The value is in bytes/second. The backup provider
+is expected to honor this rate limit for IO on the backup source and network
+traffic. A value of C<0>, C<undef>, or a missing key all mean that there is no
+limit.
+
+=item C<< $info->{'firewall-config'} >>
+
+Present if the firewall configuration exists. The guest's firewall
+configuration as raw data.
+
+=back
+
+=back
+
+=back
+
+=cut
+sub backup_vm {
+    my ($self, $vmid, $guest_config, $volumes, $info) = @_;
+
+    die "implement me in subclass";
+}
+
+=pod
+
+=over
+
+=item C<backup_container>
+
+Used when the guest type is C<lxc>. Back up the container filesystem structure
+that is made available for the mechanism returned by C<backup_get_mechanism>.
+Returns when done backing up. Ideally, the method should log the progress during
+backup.
+
+Note that this function is executed as the ID-mapped root user of the container,
+i.e. a potentially unprivileged user. The ID is passed along as part of C<$info>.
+Use the C<prepare> phase of the C<backup_hook> for preparation. For example, to
+make credentials available to the potentially unprivileged user.
+
+Parameters:
+
+=over
+
+=item C<$vmid>
+
+The ID of the guest being backed up.
+
+=item C<$guest_config>
+
+Guest configuration as raw data.
+
+=item C<$exclude_patterns>
+
+A list of glob patterns of files and directories to be excluded. C<**> is used
+to match the current directory and subdirectories. See also the following (note
+that PBS implements more than required here, like explicit inclusion when
+starting with a C<!>):
+L<vzdump documentation|https://pve.proxmox.com/pve-docs/chapter-vzdump.html#_file_exclusions>
+and
+L<PBS documentation|https://pbs.proxmox.com/docs/backup-client.html#excluding-files-directories-from-a-backup>
+
+=item C<$info>
+
+A hash reference containing optional and mechanism-specific parameters.
+
+Optional parameters:
+
+=over
+
+=item C<< $info->{'bandwidth-limit'} >>
+
+The requested bandwidth limit. The value is in bytes/second. The backup provider
+is expected to honor this rate limit for IO on the backup source and network
+traffic. A value of C<0>, C<undef>, or a missing key all mean that there is no
+limit.
+
+=item C<< $info->{'firewall-config'} >>
+
+Present if the firewall configuration exists. The guest's firewall
+configuration as raw data.
+
+=back
+
+Mechanism-specific parameters, depending on the mechanism:
+
+=over
+
+=item C<directory>
+
+=over
+
+=item C<< $info->{directory} >>
+
+Path to the directory with the container's file system structure.
+
+=item C<< $info->{sources} >>
+
+List of paths (for separate mount points, including "." for the root) inside the
+directory to be backed up.
+
+=item C<< $info->{'backup-user-id'} >>
+
+The user ID of the ID-mapped root user of the container. For example, C<100000>
+for unprivileged containers by default.
+
+=back
+
+=back
+
+=back
+
+=back
+
+=cut
+sub backup_container {
+    my ($self, $vmid, $guest_config, $exclude_patterns, $info) = @_;
+
+    die "implement me in subclass";
+}
+
+=pod
+
+=over
+
+=item C<restore_get_mechanism>
+
+Tell the caller what mechanism to use for restoring the guest. The restore
+methods for the guest, i.e. C<restore_qemu_img_init> and
+C<restore_qemu_img_cleanup> for guest type C<qemu>, or C<restore_container_init>
+and C<restore_container_cleanup> for guest type C<lxc> will be called with
+mechanism-specific information and their return value might also depend on the
+mechanism. See those methods for more information. Returns
+C<($mechanism, $vmtype)>:
+
+=over
+
+=item C<$mechanism>
+
+Currently, C<'qemu-img'> for guest type C<'qemu'> and either C<'tar'> or
+C<'directory'> for type C<'lxc'> are possible.
+
+=item C<$vmtype>
+
+Either C<qemu> or C<lxc> depending on what type the guest in the backed-up
+archive is.
+
+=back
+
+Parameters:
+
+=over
+
+=item C<$volname>
+
+The volume ID of the archive being restored.
+
+=item C<$storeid>
+
+The storage ID of the backup storage.
+
+=back
+
+=back
+
+=cut
+sub restore_get_mechanism {
+    my ($self, $volname, $storeid) = @_;
+
+    die "implement me in subclass";
+}
+
+=pod
+
+=over
+
+=item C<restore_get_guest_config>
+
+Extract the guest configuration from the given backup. Returns the raw contents
+of the backed-up configuration file.
+
+Parameters:
+
+=over
+
+=item C<$volname>
+
+The volume ID of the archive being restored.
+
+=item C<$storeid>
+
+The storage ID of the backup storage.
+
+=back
+
+=back
+
+=cut
+sub restore_get_guest_config {
+    my ($self, $volname, $storeid) = @_;
+
+    die "implement me in subclass";
+}
+
+=pod
+
+=over
+
+=item C<restore_get_firewall_config>
+
+Extract the guest's firewall configuration from the given backup. Returns the
+raw contents of the backed-up configuration file. Returns C<undef> if there is
+no firewall config in the archive, C<die> if the configuration can't be
+extracted.
+
+Parameters:
+
+=over
+
+=item C<$volname>
+
+The volume ID of the archive being restored.
+
+=item C<$storeid>
+
+The storage ID of the backup storage.
+
+=back
+
+=back
+
+=cut
+sub restore_get_firewall_config {
+    my ($self, $volname, $storeid) = @_;
+
+    die "implement me in subclass";
+}
+
+=pod
+
+=over
+
+=item C<restore_vm_init>
+
+Prepare a VM archive for restore. Returns the basic information about the
+volumes in the backup as a hash reference with the following structure:
+
+    {
+	$devicenameA => { size => $sizeA },
+	$devicenameB => { size => $sizeB },
+	...
+    }
+
+=over
+
+=item C<$devicename>
+
+The device name that was given as an argument to the backup routine when the
+backup was created.
+
+=item C<$size>
+
+The virtual size of the VM volume that was backed up. A volume with this size is
+created for the restore operation. In particular, for the C<qemu-img> mechanism,
+this should be the size of the block device referenced by the C<qemu-img-path>
+returned by C<restore_vm_volume>.
+
+=back
+
+Parameters:
+
+=over
+
+=item C<$volname>
+
+The volume ID of the archive being restored.
+
+=item C<$storeid>
+
+The storage ID of the backup storage.
+
+=back
+
+=back
+
+=cut
+sub restore_vm_init {
+    my ($self, $volname, $storeid) = @_;
+
+    die "implement me in subclass";
+}
+
+=pod
+
+=over
+
+=item C<restore_vm_cleanup>
+
+For VM backups, clean up after the restore. Called in both success and
+failure scenarios.
+
+Parameters:
+
+=over
+
+=item C<$volname>
+
+The volume ID of the archive being restored.
+
+=item C<$storeid>
+
+The storage ID of the backup storage.
+
+=back
+
+=back
+
+=cut
+sub restore_vm_cleanup {
+    my ($self, $volname, $storeid) = @_;
+
+    die "implement me in subclass";
+}
+
+=pod
+
+=over
+
+=item C<restore_vm_volume_init>
+
+Prepare a VM volume in the archive for restore. Returns a hash reference with
+the mechanism-specific information for the restore:
+
+=over
+
+=item C<qemu-img>
+
+    { 'qemu-img-path' => $path }
+
+The volume will be restored using the C<qemu-img convert> command.
+
+=over
+
+=item C<$path>
+
+A path to the volume that C<qemu-img> can use as a source for the
+C<qemu-img convert> command. E.g. this could also be an NBD URI.
+
+=back
+
+=back
+
+Parameters:
+
+=over
+
+=item C<$volname>
+
+The volume ID of the archive being restored.
+
+=item C<$storeid>
+
+The storage ID of the backup storage.
+
+=item C<$devicename>
+
+The device name associated to the volume that should be prepared for the
+restore. Same as the argument to the backup routine when the backup was created.
+
+=item C<$info>
+
+A hash reference with optional and mechanism-specific parameters. Currently
+empty.
+
+=back
+
+=back
+
+=cut
+sub restore_vm_volume_init {
+    my ($self, $volname, $storeid, $devicename, $info) = @_;
+
+    die "implement me in subclass";
+}
+
+=pod
+
+=over
+
+=item C<restore_vm_volume_cleanup>
+
+For VM backups, clean up after the restore of a given volume. Called in both
+success and failure scenarios.
+
+Parameters:
+
+=over
+
+=item C<$volname>
+
+The volume ID of the archive being restored.
+
+=item C<$storeid>
+
+The storage ID of the backup storage.
+
+=item C<$devicename>
+
+The device name associated to the volume that should be prepared for the
+restore. Same as the argument to the backup routine when the backup was created.
+
+=item C<$info>
+
+A hash reference with optional and mechanism-specific parameters. Currently
+empty.
+
+=back
+
+=back
+
+=cut
+sub restore_vm_volume_cleanup {
+    my ($self, $volname, $storeid, $devicename, $info) = @_;
+
+    die "implement me in subclass";
+}
+
+=pod
+
+=over
+
+=item C<restore_container_init>
+
+Prepare a container archive for restore. Returns a hash reference with the
+mechanism-specific information for the restore:
+
+=over
+
+=item C<tar>
+
+    { 'tar-path' => $path }
+
+The archive will be restored via the C<tar> command.
+
+=over
+
+=item C<$path>
+
+The path to the tar archive containing the full filesystem structure of the
+container.
+
+=back
+
+=item C<directory>
+
+    { 'archive-directory' => $path }
+
+The archive will be restored via C<rsync> from a directory containing the full
+filesystem structure of the container.
+
+=over
+
+=item C<$path>
+
+The path to the directory containing the full filesystem structure of the
+container.
+
+=back
+
+=back
+
+Parameters:
+
+=over
+
+=item C<$volname>
+
+The volume ID of the archive being restored.
+
+=item C<$storeid>
+
+The storage ID of the backup storage.
+
+=item C<$info>
+
+A hash reference with optional and mechanism-specific parameters. Currently
+empty.
+
+=back
+
+=back
+
+=cut
+sub restore_container_init {
+    my ($self, $volname, $storeid, $info) = @_;
+
+    die "implement me in subclass";
+}
+
+=pod
+
+=over
+
+=item C<restore_container_cleanup>
+
+For container backups, clean up after the restore. Called in both success and
+failure scenarios.
+
+Parameters:
+
+=over
+
+=item C<$volname>
+
+The volume ID of the archive being restored.
+
+=item C<$storeid>
+
+The storage ID of the backup storage.
+
+=item C<$info>
+
+A hash reference with optional and mechanism-specific parameters. Currently
+empty.
+
+=back
+
+=back
+
+=cut
+sub restore_container_cleanup {
+    my ($self, $volname, $storeid, $info) = @_;
+
+    die "implement me in subclass";
+}
+
+1;
diff --git a/src/PVE/BackupProvider/Plugin/Makefile b/src/PVE/BackupProvider/Plugin/Makefile
new file mode 100644
index 0000000..bbd7431
--- /dev/null
+++ b/src/PVE/BackupProvider/Plugin/Makefile
@@ -0,0 +1,5 @@
+SOURCES = Base.pm
+
+.PHONY: install
+install:
+	for i in ${SOURCES}; do install -D -m 0644 $$i ${DESTDIR}${PERLDIR}/PVE/BackupProvider/Plugin/$$i; done
diff --git a/src/PVE/Makefile b/src/PVE/Makefile
index d438804..8605a40 100644
--- a/src/PVE/Makefile
+++ b/src/PVE/Makefile
@@ -5,6 +5,7 @@ install:
 	install -D -m 0644 Storage.pm ${DESTDIR}${PERLDIR}/PVE/Storage.pm
 	install -D -m 0644 Diskmanage.pm ${DESTDIR}${PERLDIR}/PVE/Diskmanage.pm
 	install -D -m 0644 CephConfig.pm ${DESTDIR}${PERLDIR}/PVE/CephConfig.pm
+	make -C BackupProvider install
 	make -C Storage install
 	make -C API2 install
 	make -C CLI install
diff --git a/src/PVE/Storage.pm b/src/PVE/Storage.pm
index e251056..69500bf 100755
--- a/src/PVE/Storage.pm
+++ b/src/PVE/Storage.pm
@@ -42,11 +42,11 @@ use PVE::Storage::BTRFSPlugin;
 use PVE::Storage::ESXiPlugin;
 
 # Storage API version. Increment it on changes in storage API interface.
-use constant APIVER => 10;
+use constant APIVER => 11;
 # Age is the number of versions we're backward compatible with.
 # This is like having 'current=APIVER' and age='APIAGE' in libtool,
 # see https://www.gnu.org/software/libtool/manual/html_node/Libtool-versioning.html
-use constant APIAGE => 1;
+use constant APIAGE => 2;
 
 our $KNOWN_EXPORT_FORMATS = ['raw+size', 'tar+size', 'qcow2+size', 'vmdk+size', 'zfs', 'btrfs'];
 
@@ -2002,6 +2002,14 @@ sub volume_export_start {
     PVE::Tools::run_command($cmds, %$run_command_params);
 }
 
+sub new_backup_provider {
+    my ($cfg, $storeid, $log_function) = @_;
+
+    my $scfg = storage_config($cfg, $storeid);
+    my $plugin = PVE::Storage::Plugin->lookup($scfg->{type});
+    return $plugin->new_backup_provider($scfg, $storeid, $log_function);
+}
+
 # bash completion helper
 
 sub complete_storage {
diff --git a/src/PVE/Storage/Plugin.pm b/src/PVE/Storage/Plugin.pm
index 6071e45..3d847e9 100644
--- a/src/PVE/Storage/Plugin.pm
+++ b/src/PVE/Storage/Plugin.pm
@@ -1769,6 +1769,21 @@ sub rename_volume {
     return "${storeid}:${base}${target_vmid}/${target_volname}";
 }
 
+# Used by storage plugins for external backup providers. See PVE::BackupProvider::Plugin for the API
+# the provider needs to implement.
+#
+# $scfg - the storage configuration
+# $storeid - the storage ID
+# $log_function($log_level, $message) - this log function can be used to write to the backup task
+#   log in Proxmox VE. $log_level is 'info', 'warn' or 'err', $message is the message to be printed.
+#
+# Returns a blessed reference to the backup provider class.
+sub new_backup_provider {
+    my ($class, $scfg, $storeid, $log_function) = @_;
+
+    die "implement me if enabling the feature 'backup-provider' in plugindata()->{features}\n";
+}
+
 sub config_aware_base_mkdir {
     my ($class, $scfg, $path) = @_;
 
-- 
2.39.5



_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


^ permalink raw reply	[flat|nested] 63+ messages in thread

* [pve-devel] [RFC storage v3 13/34] extract backup config: delegate to backup provider for storages that support it
  2024-11-07 16:51 [pve-devel] [RFC qemu/common/storage/qemu-server/container/manager v3 00/34] backup provider API Fiona Ebner
                   ` (11 preceding siblings ...)
  2024-11-07 16:51 ` [pve-devel] [RFC storage v3 12/34] plugin: introduce new_backup_provider() method Fiona Ebner
@ 2024-11-07 16:51 ` Fiona Ebner
  2024-11-07 16:51 ` [pve-devel] [POC storage v3 14/34] add backup provider example Fiona Ebner
                   ` (21 subsequent siblings)
  34 siblings, 0 replies; 63+ messages in thread
From: Fiona Ebner @ 2024-11-07 16:51 UTC (permalink / raw)
  To: pve-devel

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---

Changes in v3:
* use new storage_has_feature() helper

 src/PVE/Storage.pm | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/src/PVE/Storage.pm b/src/PVE/Storage.pm
index 69500bf..9f9a86b 100755
--- a/src/PVE/Storage.pm
+++ b/src/PVE/Storage.pm
@@ -1734,6 +1734,17 @@ sub extract_vzdump_config {
 	    storage_check_enabled($cfg, $storeid);
 	    return PVE::Storage::PBSPlugin->extract_vzdump_config($scfg, $volname, $storeid);
 	}
+
+	if (storage_has_feature($cfg, $storeid, 'backup-provider')) {
+	    my $plugin = PVE::Storage::Plugin->lookup($scfg->{type});
+	    my $log_function = sub {
+		my ($log_level, $message) = @_;
+		my $prefix = $log_level eq 'err' ? 'ERROR' : uc($log_level);
+		print "$prefix: $message\n";
+	    };
+	    my $backup_provider = $plugin->new_backup_provider($scfg, $storeid, $log_function);
+	    return $backup_provider->restore_get_guest_config($volname, $storeid);
+	}
     }
 
     my $archive = abs_filesystem_path($cfg, $volid);
-- 
2.39.5



_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


^ permalink raw reply	[flat|nested] 63+ messages in thread

* [pve-devel] [POC storage v3 14/34] add backup provider example
  2024-11-07 16:51 [pve-devel] [RFC qemu/common/storage/qemu-server/container/manager v3 00/34] backup provider API Fiona Ebner
                   ` (12 preceding siblings ...)
  2024-11-07 16:51 ` [pve-devel] [RFC storage v3 13/34] extract backup config: delegate to backup provider for storages that support it Fiona Ebner
@ 2024-11-07 16:51 ` Fiona Ebner
  2024-11-13 10:52   ` Fabian Grünbichler
  2024-11-07 16:51 ` [pve-devel] [POC storage v3 15/34] WIP Borg plugin Fiona Ebner
                   ` (20 subsequent siblings)
  34 siblings, 1 reply; 63+ messages in thread
From: Fiona Ebner @ 2024-11-07 16:51 UTC (permalink / raw)
  To: pve-devel

The example uses a simple directory structure to save the backups,
grouped by guest ID. VM backups are saved as configuration files and
qcow2 images, with backing files when doing incremental backups.
Container backups are saved as configuration files and a tar file or
squashfs image (added to test the 'directory' restore mechanism).

Whether to use incremental VM backups and which backup mechanisms to
use can be configured in the storage configuration.

The 'nbdinfo' binary from the 'libnbd-bin' package is required for the
'nbd' backup mechanism for VMs, and the 'mksquashfs' binary from the
'squashfs-tools' package is required for the 'squashfs' backup
mechanism for containers.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---

Changes in v3:
* adapt to API changes
* use NBD export when restoring VM image, to make incremental backups
  using qcow2 chains work again
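
For illustration, a hypothetical storage.cfg entry using this example
plugin (property names as defined by the plugin below) might look like:

    backup-provider-dir-example: example-backup
	path /mnt/backup-example
	content backup
	vm-backup-mechanism nbd
	vm-backup-mode incremental
	lxc-backup-mode squashfs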

 .../BackupProvider/Plugin/DirectoryExample.pm | 697 ++++++++++++++++++
 src/PVE/BackupProvider/Plugin/Makefile        |   2 +-
 .../Custom/BackupProviderDirExamplePlugin.pm  | 307 ++++++++
 src/PVE/Storage/Custom/Makefile               |   5 +
 src/PVE/Storage/Makefile                      |   1 +
 5 files changed, 1011 insertions(+), 1 deletion(-)
 create mode 100644 src/PVE/BackupProvider/Plugin/DirectoryExample.pm
 create mode 100644 src/PVE/Storage/Custom/BackupProviderDirExamplePlugin.pm
 create mode 100644 src/PVE/Storage/Custom/Makefile

diff --git a/src/PVE/BackupProvider/Plugin/DirectoryExample.pm b/src/PVE/BackupProvider/Plugin/DirectoryExample.pm
new file mode 100644
index 0000000..99825ef
--- /dev/null
+++ b/src/PVE/BackupProvider/Plugin/DirectoryExample.pm
@@ -0,0 +1,697 @@
+package PVE::BackupProvider::Plugin::DirectoryExample;
+
+use strict;
+use warnings;
+
+use Fcntl qw(SEEK_SET);
+use File::Path qw(make_path remove_tree);
+use IO::File;
+use IPC::Open3;
+use POSIX (); # for POSIX::WNOHANG in backup_nbd()
+
+use PVE::Storage::Plugin;
+use PVE::Tools qw(file_get_contents file_read_firstline file_set_contents run_command);
+
+use base qw(PVE::BackupProvider::Plugin::Base);
+
+use constant {
+    BLKDISCARD => 0x1277, # see linux/fs.h
+};
+
+# Private helpers
+
+my sub log_info {
+    my ($self, $message) = @_;
+
+    $self->{'log-function'}->('info', $message);
+}
+
+my sub log_warning {
+    my ($self, $message) = @_;
+
+    $self->{'log-function'}->('warn', $message);
+}
+
+my sub log_error {
+    my ($self, $message) = @_;
+
+    $self->{'log-function'}->('err', $message);
+}
+
+# Try to use the same bitmap ID as last time for incremental backup if the storage is configured for
+# incremental VM backup. Need to start fresh if there is no previous ID or the associated backup
+# doesn't exist.
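+# The 'previous-info' file stores "<bitmap ID> <last backup time>", e.g.
+# "1730998260 1731084660" (illustrative epoch values).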
+my sub get_bitmap_id {
+    my ($self, $vmid, $vmtype) = @_;
+
+    return if $self->{'storage-plugin'}->get_vm_backup_mode($self->{scfg}) ne 'incremental';
+
+    my $previous_info_dir = "$self->{scfg}->{path}/$vmid/";
+
+    my $previous_info_file = "$previous_info_dir/previous-info";
+    my $info = file_read_firstline($previous_info_file) // '';
+    $self->{$vmid}->{'old-previous-info'} = $info;
+    my ($bitmap_id, $previous_backup_id) = $info =~ m/^(\d+)\s+(\d+)$/;
+    my $previous_backup_dir =
+	$previous_backup_id ? "$self->{scfg}->{path}/$vmid/$vmtype-$previous_backup_id" : undef;
+
+    if ($bitmap_id && -d $previous_backup_dir) {
+	$self->{$vmid}->{'previous-backup-dir'} = $previous_backup_dir;
+    } else {
+	# need to start fresh if there is no previous ID or the associated backup doesn't exist
+	$bitmap_id = $self->{$vmid}->{'backup-time'};
+    }
+
+    $self->{$vmid}->{'bitmap-id'} = $bitmap_id;
+    make_path($previous_info_dir);
+    die "unable to create directory $previous_info_dir\n" if !-d $previous_info_dir;
+    file_set_contents($previous_info_file, "$bitmap_id $self->{$vmid}->{'backup-time'}");
+
+    return $bitmap_id;
+}
+
+# Backup Provider API
+
+sub new {
+    my ($class, $storage_plugin, $scfg, $storeid, $log_function) = @_;
+
+    my $self = bless {
+	scfg => $scfg,
+	storeid => $storeid,
+	'storage-plugin' => $storage_plugin,
+	'log-function' => $log_function,
+    }, $class;
+
+    return $self;
+}
+
+sub provider_name {
+    my ($self) = @_;
+
+    return 'dir provider example';
+}
+
+# Hooks
+
+my sub job_start {
+    my ($self, $start_time) = @_;
+
+    log_info($self, "job start hook called");
+
+    run_command(["modprobe", "nbd"]);
+
+    log_info($self, "backup provider initialized successfully for new job $start_time");
+}
+
+sub job_hook {
+    my ($self, $phase, $info) = @_;
+
+    if ($phase eq 'start') {
+	job_start($self, $info->{'start-time'});
+    } elsif ($phase eq 'end') {
+	log_info($self, "job end hook called");
+    } elsif ($phase eq 'abort') {
+	log_info($self, "job abort hook called with error - $info->{error}");
+    }
+
+    # ignore unknown phase
+
+    return;
+}
+
+my sub backup_start {
+    my ($self, $vmid, $vmtype, $backup_time) = @_;
+
+    log_info($self, "backup start hook called");
+
+    my $backup_dir = $self->{scfg}->{path} . "/" . $self->{$vmid}->{archive};
+
+    make_path($backup_dir);
+    die "unable to create directory $backup_dir\n" if !-d $backup_dir;
+
+    $self->{$vmid}->{'backup-time'} = $backup_time;
+    $self->{$vmid}->{'backup-dir'} = $backup_dir;
+    $self->{$vmid}->{'task-size'} = 0;
+}
+
+my sub backup_abort {
+    my ($self, $vmid, $error) = @_;
+
+    log_info($self, "backup abort hook called");
+
+    $self->{$vmid}->{failed} = 1;
+
+    if (my $dir = $self->{$vmid}->{'backup-dir'}) {
+	eval { remove_tree($dir) };
+	$self->{'log-warning'}->("unable to clean up $dir - $@") if $@;
+    }
+
+    # Restore old previous-info so next attempt can re-use bitmap again
+    if (my $info = $self->{$vmid}->{'old-previous-info'}) {
+	my $previous_info_dir = "$self->{scfg}->{path}/$vmid/";
+	my $previous_info_file = "$previous_info_dir/previous-info";
+	file_set_contents($previous_info_file, $info);
+    }
+}
+
+sub backup_hook {
+    my ($self, $phase, $vmid, $vmtype, $info) = @_;
+
+    if ($phase eq 'start') {
+	backup_start($self, $vmid, $vmtype, $info->{'start-time'});
+    } elsif ($phase eq 'end') {
+	log_info($self, "backup end hook called");
+    } elsif ($phase eq 'abort') {
+	backup_abort($self, $vmid, $info->{error});
+    } elsif ($phase eq 'prepare') {
+	my $dir = $self->{$vmid}->{'backup-dir'};
+	chown($info->{'backup-user-id'}, -1, $dir)
+	    or die "unable to change owner for $dir\n";
+    }
+
+    # ignore unknown phase
+
+    return;
+}
+
+sub backup_get_mechanism {
+    my ($self, $vmid, $vmtype) = @_;
+
+    return ('directory', undef) if $vmtype eq 'lxc';
+
+    if ($vmtype eq 'qemu') {
+	my $backup_mechanism = $self->{'storage-plugin'}->get_vm_backup_mechanism($self->{scfg});
+	return ($backup_mechanism, get_bitmap_id($self, $vmid, $vmtype));
+    }
+
+    die "unsupported guest type '$vmtype'\n";
+}
+
+sub backup_get_archive_name {
+    my ($self, $vmid, $vmtype, $backup_time) = @_;
+
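+    # The example plugin uses "<vmid>/<vmtype>-<backup time>", e.g.
+    # "100/qemu-1730998260" (illustrative values).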
+    return $self->{$vmid}->{archive} = "${vmid}/${vmtype}-${backup_time}";
+}
+
+sub backup_get_task_size {
+    my ($self, $vmid) = @_;
+
+    return $self->{$vmid}->{'task-size'};
+}
+
+sub backup_handle_log_file {
+    my ($self, $vmid, $filename) = @_;
+
+    my $log_dir = $self->{$vmid}->{'backup-dir'};
+    if ($self->{$vmid}->{failed}) {
+	$log_dir .= ".failed";
+    }
+    make_path($log_dir);
+    die "unable to create directory $log_dir\n" if !-d $log_dir;
+
+    my $data = file_get_contents($filename);
+    my $target = "${log_dir}/backup.log";
+    file_set_contents($target, $data);
+}
+
+my sub backup_block_device {
+    my ($self, $vmid, $devicename, $size, $path, $bitmap_mode, $next_dirty_region, $bandwidth_limit) = @_;
+
+    # TODO honor bandwidth_limit
+
+    my $previous_backup_dir = $self->{$vmid}->{'previous-backup-dir'};
+    my $incremental = $previous_backup_dir && $bitmap_mode eq 'reuse';
+    my $target = "$self->{$vmid}->{'backup-dir'}/${devicename}.qcow2";
+    my $target_base = $incremental ? "${previous_backup_dir}/${devicename}.qcow2" : undef;
+    my $create_cmd = ["qemu-img", "create", "-f", "qcow2", $target, $size];
+    push $create_cmd->@*, "-b", $target_base, "-F", "qcow2" if $target_base;
+    run_command($create_cmd);
+
+    eval {
+	# allows to easily write to qcow2 target
+	run_command(["qemu-nbd", "-c", "/dev/nbd15", $target, "--format=qcow2"]);
+
+	my $block_size = 4 * 1024 * 1024; # 4 MiB
+
+	my $in_fh = IO::File->new($path, "r+")
+	    or die "unable to open NBD backup source - $!\n";
+	my $out_fh = IO::File->new("/dev/nbd15", "r+")
+	    or die "unable to open NBD backup target - $!\n";
+
+	my $buffer = '';
+
+	while (scalar((my $region_offset, my $region_length) = $next_dirty_region->())) {
+	    sysseek($in_fh, $region_offset, SEEK_SET)
+		// die "unable to seek '$region_offset' in NBD backup source - $!";
+	    sysseek($out_fh, $region_offset, SEEK_SET)
+		// die "unable to seek '$region_offset' in NBD backup target - $!";
+
+	    my $local_offset = 0; # within the region
+	    while ($local_offset < $region_length) {
+		my $remaining = $region_length - $local_offset;
+		my $request_size = $remaining < $block_size ? $remaining : $block_size;
+		my $offset = $region_offset + $local_offset;
+
+		my $read = sysread($in_fh, $buffer, $request_size);
+
+		die "failed to read from backup source - $!\n" if !defined($read);
+		die "premature EOF while reading backup source\n" if $read == 0;
+
+		my $written = 0;
+		while ($written < $read) {
+		    my $res = syswrite($out_fh, $buffer, $request_size - $written, $written);
+		    die "failed to write to backup target - $!\n" if !defined($res);
+		    die "unable to progress writing to backup target\n" if $res == 0;
+		    $written += $res;
+		}
+
+		ioctl($in_fh, BLKDISCARD, pack('QQ', int($offset), int($request_size)));
+
+		$local_offset += $request_size;
+	    }
+	}
+    };
+    my $err = $@;
+
+    eval { run_command(["qemu-nbd", "-d", "/dev/nbd15" ]); };
+    $self->{'log-warning'}->("unable to disconnect NBD backup target - $@") if $@;
+
+    die $err if $err;
+}
+
+my sub backup_nbd {
+    my ($self, $vmid, $devicename, $size, $nbd_path, $bitmap_mode, $bitmap_name, $bandwidth_limit) = @_;
+
+    # TODO honor bandwidth_limit
+
+    die "need 'nbdinfo' binary from package libnbd-bin\n" if !-e "/usr/bin/nbdinfo";
+
+    my $nbd_info_uri = "nbd+unix:///${devicename}?socket=${nbd_path}";
+    my $qemu_nbd_uri = "nbd:unix:${nbd_path}:exportname=${devicename}";
+
+    my $cpid;
+    my $error_fh;
+    my $next_dirty_region;
+
+    # If there is no dirty bitmap, it can be treated as if there's a full dirty one. The output of
+    # nbdinfo is a list of tuples with offset, length, type, description. The first bit of 'type' is
+    # set when the bitmap is dirty, see QEMU's docs/interop/nbd.txt
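+    # Illustrative example line: "0 1048576 1 dirty", i.e. offset 0, length
+    # 1 MiB, type 1 (dirty bit set).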
+    my $dirty_bitmap = [];
+    if ($bitmap_mode ne 'none') {
+	my $input = IO::File->new();
+	my $info = IO::File->new();
+	$error_fh = IO::File->new();
+	my $nbdinfo_cmd = ["nbdinfo", $nbd_info_uri, "--map=qemu:dirty-bitmap:${bitmap_name}"];
+	$cpid = open3($input, $info, $error_fh, $nbdinfo_cmd->@*)
+	    or die "failed to spawn nbdinfo child - $!\n";
+
+	$next_dirty_region = sub {
+	    my ($offset, $length, $type);
+	    do {
+		my $line = <$info>;
+		return if !$line;
+		die "unexpected output from nbdinfo - $line\n"
+		    if $line !~ m/^\s*(\d+)\s*(\d+)\s*(\d+)/; # also untaints
+		($offset, $length, $type) = ($1, $2, $3);
+	    } while (($type & 0x1) == 0); # not dirty
+	    return ($offset, $length);
+	};
+    } else {
+	my $done = 0;
+	$next_dirty_region = sub {
+	    return if $done;
+	    $done = 1;
+	    return (0, $size);
+	};
+    }
+
+    eval {
+	run_command(["qemu-nbd", "-c", "/dev/nbd0", $qemu_nbd_uri, "--format=raw", "--discard=on"]);
+
+	backup_block_device(
+	    $self,
+	    $vmid,
+	    $devicename,
+	    $size,
+	    '/dev/nbd0',
+	    $bitmap_mode,
+	    $next_dirty_region,
+	    $bandwidth_limit,
+	);
+    };
+    my $err = $@;
+
+    eval { run_command(["qemu-nbd", "-d", "/dev/nbd0" ]); };
+    $self->{'log-warning'}->("unable to disconnect NBD backup source - $@") if $@;
+
+    if ($cpid) {
+	my $waited;
+	my $wait_limit = 5;
+	for ($waited = 0; $waited < $wait_limit && waitpid($cpid, POSIX::WNOHANG) == 0; $waited++) {
+	    kill 15, $cpid if $waited == 0;
+	    sleep 1;
+	}
+	if ($waited == $wait_limit) {
+	    kill 9, $cpid;
+	    sleep 1;
+	    $self->{'log-warning'}->("unable to collect nbdinfo child process")
+		if waitpid($cpid, POSIX::WNOHANG) == 0;
+	}
+    }
+
+    die $err if $err;
+}
+
+my sub backup_vm_volume {
+    my ($self, $vmid, $devicename, $info, $bandwidth_limit) = @_;
+
+    my $backup_mechanism = $self->{'storage-plugin'}->get_vm_backup_mechanism($self->{scfg});
+
+    if ($backup_mechanism eq 'nbd') {
+	backup_nbd(
+	    $self,
+	    $vmid,
+	    $devicename,
+	    $info->{size},
+	    $info->{'nbd-path'},
+	    $info->{'bitmap-mode'},
+	    $info->{'bitmap-name'},
+	    $bandwidth_limit,
+	);
+    } elsif ($backup_mechanism eq 'block-device') {
+	backup_block_device(
+	    $self,
+	    $vmid,
+	    $devicename,
+	    $info->{size},
+	    $info->{path},
+	    $info->{'bitmap-mode'},
+	    $info->{'next-dirty-region'},
+	    $bandwidth_limit,
+	);
+    } else {
+	die "internal error - unknown VM backup mechansim '$backup_mechanism'\n";
+    }
+}
+
+sub backup_vm {
+    my ($self, $vmid, $guest_config, $volumes, $info) = @_;
+
+    my $target = "$self->{$vmid}->{'backup-dir'}/guest.conf";
+    file_set_contents($target, $guest_config);
+
+    $self->{$vmid}->{'task-size'} += -s $target;
+
+    if (my $firewall_config = $info->{'firewall-config'}) {
+	$target = "$self->{$vmid}->{'backup-dir'}/firewall.conf";
+	file_set_contents($target, $firewall_config);
+
+	$self->{$vmid}->{'task-size'} += -s $target;
+    }
+
+    for my $devicename (sort keys $volumes->%*) {
+	backup_vm_volume(
+	    $self, $vmid, $devicename, $volumes->{$devicename}, $info->{'bandwidth-limit'});
+    }
+}
+
+my sub backup_directory_tar {
+    my ($self, $vmid, $directory, $exclude_patterns, $sources, $bandwidth_limit) = @_;
+
+    # essentially copied from PVE/VZDump/LXC.pm' archive()
+
+    # copied from PVE::Storage::Plugin::COMMON_TAR_FLAGS
+    my @tar_flags = qw(
+	--one-file-system
+	-p --sparse --numeric-owner --acls
+	--xattrs --xattrs-include=user.* --xattrs-include=security.capability
+	--warning=no-file-ignored --warning=no-xattr-write
+    );
+
+    my $tar = ['tar', 'cpf', '-', '--totals', @tar_flags];
+
+    push @$tar, "--directory=$directory";
+
+    my @exclude_no_anchored = ();
+    my @exclude_anchored = ();
+    for my $pattern ($exclude_patterns->@*) {
+	if ($pattern !~ m|^/|) {
+	    push @exclude_no_anchored, $pattern;
+	} else {
+	    push @exclude_anchored, $pattern;
+	}
+    }
+
+    push @$tar, '--no-anchored';
+    push @$tar, '--exclude=lost+found';
+    push @$tar, map { "--exclude=$_" } @exclude_no_anchored;
+
+    push @$tar, '--anchored';
+    push @$tar, map { "--exclude=.$_" } @exclude_anchored;
+
+    push @$tar, $sources->@*;
+
+    my $cmd = [ $tar ];
+
+    push @$cmd, [ 'cstream', '-t', $bandwidth_limit * 1024 ] if $bandwidth_limit;
+
+    my $target = "$self->{$vmid}->{'backup-dir'}/archive.tar";
+    push @{$cmd->[-1]}, \(">" . PVE::Tools::shellquote($target));
+
+    my $logfunc = sub {
+	my $line = shift;
+	log_info($self, "tar: $line");
+    };
+
+    PVE::Tools::run_command($cmd, logfunc => $logfunc);
+
+    return;
+};
+
+# NOTE This only serves as an example to illustrate the 'directory' restore mechanism. It is not
+# fleshed out properly, e.g. I didn't check if exclusion is compatible with
+# proxmox-backup-client/rsync or xattrs/ACL/etc. work as expected!
+my sub backup_directory_squashfs {
+    my ($self, $vmid, $directory, $exclude_patterns, $bandwidth_limit) = @_;
+
+    my $target = "$self->{$vmid}->{'backup-dir'}/archive.sqfs";
+
+    my $mksquashfs = ['mksquashfs', $directory, $target, '-quiet', '-no-progress'];
+
+    push $mksquashfs->@*, '-wildcards';
+
+    for my $pattern ($exclude_patterns->@*) {
+	if ($pattern !~ m|^/|) { # non-anchored
+	    push $mksquashfs->@*, '-e', "... $pattern";
+	} else { # anchored
+	    push $mksquashfs->@*, '-e', substr($pattern, 1); # need to strip leading slash
+	}
+    }
+
+    my $cmd = [ $mksquashfs ];
+
+    push @$cmd, [ 'cstream', '-t', $bandwidth_limit * 1024 ] if $bandwidth_limit;
+
+    my $logfunc = sub {
+	my $line = shift;
+	log_info($self, "mksquashfs: $line");
+    };
+
+    PVE::Tools::run_command($cmd, logfunc => $logfunc);
+
+    return;
+};
+
+sub backup_container {
+    my ($self, $vmid, $guest_config, $exclude_patterns, $info) = @_;
+
+    my $target = "$self->{$vmid}->{'backup-dir'}/guest.conf";
+    file_set_contents($target, $guest_config);
+
+    $self->{$vmid}->{'task-size'} += -s $target;
+
+    if (my $firewall_config = $info->{'firewall-config'}) {
+	$target = "$self->{$vmid}->{'backup-dir'}/firewall.conf";
+	file_set_contents($target, $firewall_config);
+
+	$self->{$vmid}->{'task-size'} += -s $target;
+    }
+
+    my $backup_mode = $self->{'storage-plugin'}->get_lxc_backup_mode($self->{scfg});
+    if ($backup_mode eq 'tar') {
+	backup_directory_tar(
+	    $self,
+	    $vmid,
+	    $info->{directory},
+	    $exclude_patterns,
+	    $info->{sources},
+	    $info->{'bandwidth-limit'},
+	);
+    } elsif ($backup_mode eq 'squashfs') {
+	backup_directory_squashfs(
+	    $self,
+	    $vmid,
+	    $info->{directory},
+	    $exclude_patterns,
+	    $info->{'bandwidth-limit'},
+	);
+    } else {
+	die "got unexpected backup mode '$backup_mode' from storage plugin\n";
+    }
+}
+
+# Restore API
+
+sub restore_get_mechanism {
+    my ($self, $volname, $storeid) = @_;
+
+    my (undef, $relative_backup_dir) = $self->{'storage-plugin'}->parse_volname($volname);
+    my ($vmtype) = $relative_backup_dir =~ m!^\d+/([a-z]+)-!;
+
+    return ('qemu-img', $vmtype) if $vmtype eq 'qemu';
+
+    if ($vmtype eq 'lxc') {
+	if (-e "$self->{scfg}->{path}/${relative_backup_dir}/archive.tar") {
+	    $self->{'restore-mechanisms'}->{$volname} = 'tar';
+	    return ('tar', $vmtype);
+	}
+
+	if (-e "$self->{scfg}->{path}/${relative_backup_dir}/archive.sqfs") {
+	    $self->{'restore-mechanisms'}->{$volname} = 'directory';
+	    return ('directory', $vmtype)
+	}
+
+	die "unable to find archive '$volname'\n";
+    }
+
+    die "cannot restore unexpected guest type '$vmtype'\n";
+}
+
+sub restore_get_guest_config {
+    my ($self, $volname, $storeid) = @_;
+
+    my (undef, $relative_backup_dir) = $self->{'storage-plugin'}->parse_volname($volname);
+    my $filename = "$self->{scfg}->{path}/${relative_backup_dir}/guest.conf";
+
+    return file_get_contents($filename);
+}
+
+sub restore_get_firewall_config {
+    my ($self, $volname, $storeid) = @_;
+
+    my (undef, $relative_backup_dir) = $self->{'storage-plugin'}->parse_volname($volname);
+    my $filename = "$self->{scfg}->{path}/${relative_backup_dir}/firewall.conf";
+
+    return if !-e $filename;
+
+    return file_get_contents($filename);
+}
+
+sub restore_vm_init {
+    my ($self, $volname, $storeid) = @_;
+
+    my $res = {};
+
+    my (undef, $relative_backup_dir) = $self->{'storage-plugin'}->parse_volname($volname);
+    my $backup_dir = "$self->{scfg}->{path}/${relative_backup_dir}";
+
+    my @backup_files = glob("$backup_dir/*");
+    for my $backup_file (@backup_files) {
+	next if $backup_file !~ m!^(.*/(.*)\.qcow2)$!;
+	$backup_file = $1; # untaint
+	$res->{$2}->{size} = PVE::Storage::Plugin::file_size_info($backup_file);
+    }
+
+    return $res;
+}
+
+sub restore_vm_cleanup {
+    my ($self, $volname, $storeid) = @_;
+
+    return; # nothing to do
+}
+
+sub restore_vm_volume_init {
+    my ($self, $volname, $storeid, $devicename, $info) = @_;
+
+    my (undef, $relative_backup_dir) = $self->{'storage-plugin'}->parse_volname($volname);
+    my $image = "$self->{scfg}->{path}/${relative_backup_dir}/${devicename}.qcow2";
+    # NOTE Backing files are not allowed by Proxmox VE when restoring. The reason is that an
+    # untrusted qcow2 image can specify an arbitrary backing file and thus leak data from the host.
+    # For the sake of the directory example plugin, an NBD export is created, but this side-steps
+    # the check and would allow the attack again. An actual implementation should check that the
+    # backing file (or rather, the whole backing chain) is safe first!
+    PVE::Tools::run_command(['qemu-nbd', '-c', '/dev/nbd7', $image]);
+    return {
+	'qemu-img-path' => '/dev/nbd7',
+    };
+}
+
+sub restore_vm_volume_cleanup {
+    my ($self, $volname, $storeid, $devicename, $info) = @_;
+
+    PVE::Tools::run_command(['qemu-nbd', '-d', '/dev/nbd7']);
+
+    return;
+}
+
+my sub restore_tar_init {
+    my ($self, $volname, $storeid) = @_;
+
+    my (undef, $relative_backup_dir) = $self->{'storage-plugin'}->parse_volname($volname);
+    return { 'tar-path' => "$self->{scfg}->{path}/${relative_backup_dir}/archive.tar" };
+}
+
+my sub restore_directory_init {
+    my ($self, $volname, $storeid) = @_;
+
+    my (undef, $relative_backup_dir, $vmid) = $self->{'storage-plugin'}->parse_volname($volname);
+    my $archive = "$self->{scfg}->{path}/${relative_backup_dir}/archive.sqfs";
+
+    my $mount_point = "/run/backup-provider-example/${vmid}.mount";
+    make_path($mount_point);
+    die "unable to create directory $mount_point\n" if !-d $mount_point;
+
+    run_command(['mount', '-o', 'ro', $archive, $mount_point]);
+
+    return { 'archive-directory' => $mount_point };
+}
+
+my sub restore_directory_cleanup {
+    my ($self, $volname, $storeid) = @_;
+
+    my (undef, undef, $vmid) = $self->{'storage-plugin'}->parse_volname($volname);
+    my $mount_point = "/run/backup-provider-example/${vmid}.mount";
+
+    run_command(['umount', $mount_point]);
+
+    return;
+}
+
+sub restore_container_init {
+    my ($self, $volname, $storeid, $info) = @_;
+
+    if ($self->{'restore-mechanisms'}->{$volname} eq 'tar') {
+	return restore_tar_init($self, $volname, $storeid);
+    } elsif ($self->{'restore-mechanisms'}->{$volname} eq 'directory') {
+	return restore_directory_init($self, $volname, $storeid);
+    } else {
+	die "no restore mechanism set for '$volname'\n";
+    }
+}
+
+sub restore_container_cleanup {
+    my ($self, $volname, $storeid, $info) = @_;
+
+    if ($self->{'restore-mechanisms'}->{$volname} eq 'tar') {
+	return; # nothing to do
+    } elsif ($self->{'restore-mechanisms'}->{$volname} eq 'directory') {
+	return restore_directory_cleanup($self, $volname, $storeid);
+    } else {
+	die "no restore mechanism set for '$volname'\n";
+    }
+}
+
+1;
diff --git a/src/PVE/BackupProvider/Plugin/Makefile b/src/PVE/BackupProvider/Plugin/Makefile
index bbd7431..bedc26e 100644
--- a/src/PVE/BackupProvider/Plugin/Makefile
+++ b/src/PVE/BackupProvider/Plugin/Makefile
@@ -1,4 +1,4 @@
-SOURCES = Base.pm
+SOURCES = Base.pm DirectoryExample.pm
 
 .PHONY: install
 install:
diff --git a/src/PVE/Storage/Custom/BackupProviderDirExamplePlugin.pm b/src/PVE/Storage/Custom/BackupProviderDirExamplePlugin.pm
new file mode 100644
index 0000000..5152923
--- /dev/null
+++ b/src/PVE/Storage/Custom/BackupProviderDirExamplePlugin.pm
@@ -0,0 +1,307 @@
+package PVE::Storage::Custom::BackupProviderDirExamplePlugin;
+
+use strict;
+use warnings;
+
+use File::Basename qw(basename);
+
+use PVE::BackupProvider::Plugin::DirectoryExample;
+use PVE::Tools;
+
+use base qw(PVE::Storage::Plugin);
+
+# Helpers
+
+sub get_vm_backup_mechanism {
+    my ($class, $scfg) = @_;
+
+    return $scfg->{'vm-backup-mechanism'} // properties()->{'vm-backup-mechanism'}->{'default'};
+}
+
+sub get_vm_backup_mode {
+    my ($class, $scfg) = @_;
+
+    return $scfg->{'vm-backup-mode'} // properties()->{'vm-backup-mode'}->{'default'};
+}
+
+sub get_lxc_backup_mode {
+    my ($class, $scfg) = @_;
+
+    return $scfg->{'lxc-backup-mode'} // properties()->{'lxc-backup-mode'}->{'default'};
+}
+
+# Configuration
+
+sub api {
+    return 11;
+}
+
+sub type {
+    return 'backup-provider-dir-example';
+}
+
+sub plugindata {
+    return {
+	content => [ { backup => 1, none => 1 }, { backup => 1 } ],
+	features => { 'backup-provider' => 1 },
+    };
+}
+
+sub properties {
+    return {
+	'lxc-backup-mode' => {
+	    description => "How to create LXC backups. tar - create a tar archive."
+		." squashfs - create a squashfs image. Requires squashfs-tools to be installed.",
+	    type => 'string',
+	    enum => [qw(tar squashfs)],
+	    default => 'tar',
+	},
+	'vm-backup-mechanism' => {
+	    description => "Which mechanism to use for creating VM backups. nbd - access data via "
+		." NBD export. block-device - access data via regular block device.",
+	    type => 'string',
+	    enum => [qw(nbd block-device)],
+	    default => 'block-device',
+	},
+	'vm-backup-mode' => {
+	    description => "How to create VM backups. full - always create full backups."
+		." incremental - create incremental backups when possible, fallback to full when"
+		." necessary, e.g. VM disk's bitmap is invalid.",
+	    type => 'string',
+	    enum => [qw(full incremental)],
+	    default => 'full',
+	},
+    };
+}
+
+sub options {
+    return {
+	path => { fixed => 1 },
+	'lxc-backup-mode' => { optional => 1 },
+	'vm-backup-mechanism' => { optional => 1 },
+	'vm-backup-mode' => { optional => 1 },
+	disable => { optional => 1 },
+	nodes => { optional => 1 },
+	'prune-backups' => { optional => 1 },
+	'max-protected-backups' => { optional => 1 },
+    };
+}
+
+# Storage implementation
+
+# NOTE a proper backup storage should implement this
+sub prune_backups {
+    my ($class, $scfg, $storeid, $keep, $vmid, $type, $dryrun, $logfunc) = @_;
+
+    die "not implemented";
+}
+
+sub parse_volname {
+    my ($class, $volname) = @_;
+
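+    # Expects e.g. "backup/100/qemu-1730998260" (illustrative), matching
+    # the archive names produced by the backup provider plugin.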
+    if ($volname =~ m!^backup/((\d+)/[a-z]+-\d+)$!) {
+	my ($filename, $vmid) = ($1, $2);
+	return ('backup', $filename, $vmid);
+    }
+
+    die "unable to parse volume name '$volname'\n";
+}
+
+sub path {
+    my ($class, $scfg, $volname, $storeid, $snapname) = @_;
+
+    die "volume snapshot is not possible on backup-provider-dir-example volume" if $snapname;
+
+    my ($type, $filename, $vmid) = $class->parse_volname($volname);
+
+    return ("$scfg->{path}/${filename}", $vmid, $type);
+}
+
+sub create_base {
+    my ($class, $storeid, $scfg, $volname) = @_;
+
+    die "cannot create base image in backup-provider-dir-example storage\n";
+}
+
+sub clone_image {
+    my ($class, $scfg, $storeid, $volname, $vmid, $snap) = @_;
+
+    die "can't clone images in backup-provider-dir-example storage\n";
+}
+
+sub alloc_image {
+    my ($class, $storeid, $scfg, $vmid, $fmt, $name, $size) = @_;
+
+    die "can't allocate space in backup-provider-dir-example storage\n";
+}
+
+# NOTE a proper backup storage should implement this
+sub free_image {
+    my ($class, $storeid, $scfg, $volname, $isBase) = @_;
+
+    # if it's a backing file, it would need to be merged into the upper image first.
+
+    die "not implemented";
+}
+
+sub list_images {
+    my ($class, $storeid, $scfg, $vmid, $vollist, $cache) = @_;
+
+    my $res = [];
+
+    return $res;
+}
+
+sub list_volumes {
+    my ($class, $storeid, $scfg, $vmid, $content_types) = @_;
+
+    my $path = $scfg->{path};
+
+    my $res = [];
+    for my $type ($content_types->@*) {
+	next if $type ne 'backup';
+
+	my @guest_dirs = glob("$path/*");
+	for my $guest_dir (@guest_dirs) {
+	    next if !-d $guest_dir || $guest_dir !~ m!/(\d+)$!;
+
+	    my $backup_vmid = basename($guest_dir);
+
+	    next if defined($vmid) && $backup_vmid != $vmid;
+
+	    my @backup_dirs = glob("$guest_dir/*");
+	    for my $backup_dir (@backup_dirs) {
+		next if !-d $backup_dir || $backup_dir !~ m!/(lxc|qemu)-(\d+)$!;
+		my ($subtype, $backup_id) = ($1, $2);
+
+		my $size = 0;
+		my @backup_files = glob("$backup_dir/*");
+		$size += -s $_ for @backup_files;
+
+		push $res->@*, {
+		    volid => "$storeid:backup/${backup_vmid}/${subtype}-${backup_id}",
+		    vmid => $backup_vmid,
+		    format => "directory",
+		    ctime => $backup_id,
+		    size => $size,
+		    subtype => $subtype,
+		    content => $type,
+		    # TODO parent for incremental
+		};
+	    }
+	}
+    }
+
+    return $res;
+}
+
+sub activate_storage {
+    my ($class, $storeid, $scfg, $cache) = @_;
+
+    my $path = $scfg->{path};
+
+    my $timeout = 2;
+    if (!PVE::Tools::run_fork_with_timeout($timeout, sub {-d $path})) {
+	die "unable to activate storage '$storeid' - directory '$path' does not exist or is"
+	    ." unreachable\n";
+    }
+
+    return 1;
+}
+
+sub deactivate_storage {
+    my ($class, $storeid, $scfg, $cache) = @_;
+
+    return 1;
+}
+
+sub activate_volume {
+    my ($class, $storeid, $scfg, $volname, $snapname, $cache) = @_;
+
+    die "volume snapshot is not possible on backup-provider-dir-example volume" if $snapname;
+
+    return 1;
+}
+
+sub deactivate_volume {
+    my ($class, $storeid, $scfg, $volname, $snapname, $cache) = @_;
+
+    die "volume snapshot is not possible on backup-provider-dir-example volume" if $snapname;
+
+    return 1;
+}
+
+sub get_volume_attribute {
+    my ($class, $scfg, $storeid, $volname, $attribute) = @_;
+
+    return;
+}
+
+# NOTE a proper backup storage should implement this to support backup notes and
+# setting protected status.
+sub update_volume_attribute {
+    my ($class, $scfg, $storeid, $volname, $attribute, $value) = @_;
+
+    die "attribute '$attribute' is not supported on backup-provider-dir-example volume";
+}
+
+sub volume_size_info {
+    my ($class, $scfg, $storeid, $volname, $timeout) = @_;
+
+    my (undef, $relative_backup_dir) = $class->parse_volname($volname);
+    my ($ctime) = $relative_backup_dir =~ m/-(\d+)$/;
+    my $backup_dir = "$scfg->{path}/${relative_backup_dir}";
+
+    my $size = 0;
+    my @backup_files = glob("$backup_dir/*");
+    for my $backup_file (@backup_files) {
+	if ($backup_file =~ m!\.qcow2$!) {
+	    $size += $class->file_size_info($backup_file);
+	} else {
+	    $size += -s $backup_file;
+	}
+    }
+
+    my $parent; # TODO for incremental
+
+    return wantarray ? ($size, 'directory', $size, $parent, $ctime) : $size;
+}
+
+sub volume_resize {
+    my ($class, $scfg, $storeid, $volname, $size, $running) = @_;
+
+    die "volume resize is not possible on backup-provider-dir-example volume";
+}
+
+sub volume_snapshot {
+    my ($class, $scfg, $storeid, $volname, $snap) = @_;
+
+    die "volume snapshot is not possible on backup-provider-dir-example volume";
+}
+
+sub volume_snapshot_rollback {
+    my ($class, $scfg, $storeid, $volname, $snap) = @_;
+
+    die "volume snapshot rollback is not possible on backup-provider-dir-example volume";
+}
+
+sub volume_snapshot_delete {
+    my ($class, $scfg, $storeid, $volname, $snap) = @_;
+
+    die "volume snapshot delete is not possible on backup-provider-dir-example volume";
+}
+
+sub volume_has_feature {
+    my ($class, $scfg, $feature, $storeid, $volname, $snapname, $running) = @_;
+
+    return 0;
+}
+
+sub new_backup_provider {
+    my ($class, $scfg, $storeid, $log_function) = @_;
+
+    return PVE::BackupProvider::Plugin::DirectoryExample->new(
+	$class, $scfg, $storeid, $log_function);
+}
+
+1;
diff --git a/src/PVE/Storage/Custom/Makefile b/src/PVE/Storage/Custom/Makefile
new file mode 100644
index 0000000..c1e3eca
--- /dev/null
+++ b/src/PVE/Storage/Custom/Makefile
@@ -0,0 +1,5 @@
+SOURCES = BackupProviderDirExamplePlugin.pm
+
+.PHONY: install
+install:
+	for i in ${SOURCES}; do install -D -m 0644 $$i ${DESTDIR}${PERLDIR}/PVE/Storage/Custom/$$i; done
diff --git a/src/PVE/Storage/Makefile b/src/PVE/Storage/Makefile
index d5cc942..acd37f4 100644
--- a/src/PVE/Storage/Makefile
+++ b/src/PVE/Storage/Makefile
@@ -19,4 +19,5 @@ SOURCES= \
 .PHONY: install
 install:
 	for i in ${SOURCES}; do install -D -m 0644 $$i ${DESTDIR}${PERLDIR}/PVE/Storage/$$i; done
+	make -C Custom install
 	make -C LunCmd install
-- 
2.39.5



_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


^ permalink raw reply	[flat|nested] 63+ messages in thread

* [pve-devel] [POC storage v3 15/34] WIP Borg plugin
  2024-11-07 16:51 [pve-devel] [RFC qemu/common/storage/qemu-server/container/manager v3 00/34] backup provider API Fiona Ebner
                   ` (13 preceding siblings ...)
  2024-11-07 16:51 ` [pve-devel] [POC storage v3 14/34] add backup provider example Fiona Ebner
@ 2024-11-07 16:51 ` Fiona Ebner
  2024-11-13 10:52   ` Fabian Grünbichler
  2024-11-07 16:51 ` [pve-devel] [PATCH qemu-server v3 16/34] move nbd_stop helper to QMPHelpers module Fiona Ebner
                   ` (19 subsequent siblings)
  34 siblings, 1 reply; 63+ messages in thread
From: Fiona Ebner @ 2024-11-07 16:51 UTC (permalink / raw)
  To: pve-devel

Archive names start with the guest type and ID, followed by the same
timestamp format that PBS uses.

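For example, a backup of a hypothetical VM 100 taken at
2024-11-07T16:51:00Z (UTC) results in the archive name
'pve-qemu-100-2024-11-07T16:51:00Z'.
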
Container archives have the following structure:
guest.config
firewall.config
filesystem/ # containing the whole filesystem structure

VM archives have the following structure:
guest.config
firewall.config
volumes/ # containing a raw file for each device

A bind mount (for containers) respectively symlinks (for VMs) are used
to achieve this structure, because Borg doesn't seem to support
renaming on the fly.
(Prefix stripping via the "slashdot hack" would have helped slightly,
but is only available in Borg >= 1.4
https://github.com/borgbackup/borg/actions/runs/7967940995)

NOTE: Bandwidth limit is not yet honored and the task size is not
calculated yet. Discard for VM backups would also be nice to have, but
it's not entirely clear how (parsing progress and discarding according
to that is one idea). There is no dirty bitmap support, not sure if
that is feasible to add.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---

Changes in v3:
* make SSH work.
* adapt to API changes, i.e. config as raw data and user namespace
  execution context for containers.
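
For illustration, a hypothetical storage.cfg entry might look like the
following (ssh-key is only a flag here; the actual key and the password
are stored under /etc/pve/priv/storage/):

    borg: borg-example
	server 192.0.2.10
	username backup
	repository-path /srv/borg/pve-repo
	ssh-fingerprint <host key for the known_hosts file>
	ssh-key 1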

 src/PVE/API2/Storage/Config.pm         |   2 +-
 src/PVE/BackupProvider/Plugin/Borg.pm  | 439 ++++++++++++++++++
 src/PVE/BackupProvider/Plugin/Makefile |   2 +-
 src/PVE/Storage.pm                     |   2 +
 src/PVE/Storage/BorgBackupPlugin.pm    | 595 +++++++++++++++++++++++++
 src/PVE/Storage/Makefile               |   1 +
 6 files changed, 1039 insertions(+), 2 deletions(-)
 create mode 100644 src/PVE/BackupProvider/Plugin/Borg.pm
 create mode 100644 src/PVE/Storage/BorgBackupPlugin.pm

diff --git a/src/PVE/API2/Storage/Config.pm b/src/PVE/API2/Storage/Config.pm
index e04b6ab..1cbf09d 100755
--- a/src/PVE/API2/Storage/Config.pm
+++ b/src/PVE/API2/Storage/Config.pm
@@ -190,7 +190,7 @@ __PACKAGE__->register_method ({
 	return &$api_storage_config($cfg, $param->{storage});
     }});
 
-my $sensitive_params = [qw(password encryption-key master-pubkey keyring)];
+my $sensitive_params = [qw(password encryption-key master-pubkey keyring ssh-key)];
 
 __PACKAGE__->register_method ({
     name => 'create',
diff --git a/src/PVE/BackupProvider/Plugin/Borg.pm b/src/PVE/BackupProvider/Plugin/Borg.pm
new file mode 100644
index 0000000..7bb3ae3
--- /dev/null
+++ b/src/PVE/BackupProvider/Plugin/Borg.pm
@@ -0,0 +1,439 @@
+package PVE::BackupProvider::Plugin::Borg;
+
+use strict;
+use warnings;
+
+use File::chdir;
+use File::Basename qw(basename);
+use File::Path qw(make_path remove_tree);
+use Net::IP;
+use POSIX qw(strftime);
+
+use PVE::Tools;
+
+# ($vmtype, $vmid, $time_string)
+our $ARCHIVE_RE_3 = qr!^pve-(lxc|qemu)-([0-9]+)-([0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}:[0-9]{2}:[0-9]{2}Z)$!;
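+# e.g. matches "pve-qemu-100-2024-11-07T16:51:00Z" (illustrative)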
+
+sub archive_name {
+    my ($vmtype, $vmid, $backup_time) = @_;
+
+    return "pve-${vmtype}-${vmid}-" . strftime("%FT%TZ", gmtime($backup_time));
+}
+
+# remove_tree can be very verbose by default, do explicit error handling and limit to one message
+my sub _remove_tree {
+    my ($path) = @_;
+
+    remove_tree($path, { error => \my $err });
+    if ($err && @$err) { # empty array if no error
+	for my $diag (@$err) {
+	    my ($file, $message) = %$diag;
+	    die "cannot remove_tree '$path': $message\n" if $file eq '';
+	    die "cannot remove_tree '$path': unlinking $file failed - $message\n";
+	}
+    }
+}
+
+my sub prepare_run_dir {
+    my ($archive, $operation) = @_;
+
+    my $run_dir = "/run/pve-storage-borg-plugin/${archive}.${operation}";
+    _remove_tree($run_dir);
+    make_path($run_dir);
+    die "unable to create directory $run_dir\n" if !-d $run_dir;
+
+    return $run_dir;
+}
+
+my sub log_info {
+    my ($self, $message) = @_;
+
+    $self->{'log-function'}->('info', $message);
+}
+
+my sub log_warning {
+    my ($self, $message) = @_;
+
+    $self->{'log-function'}->('warn', $message);
+}
+
+my sub log_error {
+    my ($self, $message) = @_;
+
+    $self->{'log-function'}->('err', $message);
+}
+
+my sub file_contents_from_archive {
+    my ($self, $archive, $file) = @_;
+
+    my $run_dir = prepare_run_dir($archive, "file-contents");
+
+    my $raw;
+
+    eval {
+	local $CWD = $run_dir;
+
+	$self->{'storage-plugin'}->borg_cmd_extract(
+	    $self->{scfg},
+	    $self->{storeid},
+	    $archive,
+	    [$file],
+	);
+
+	$raw = PVE::Tools::file_get_contents("${run_dir}/${file}");
+    };
+    my $err = $@;
+    eval { _remove_tree($run_dir); };
+    log_warning($self, $@) if $@;
+    die $err if $err;
+
+    return $raw;
+}
+
+# Plugin implementation
+
+sub new {
+    my ($class, $storage_plugin, $scfg, $storeid, $log_function) = @_;
+
+    my $self = bless {
+	scfg => $scfg,
+	storeid => $storeid,
+	'storage-plugin' => $storage_plugin,
+	'log-function' => $log_function,
+    }, $class;
+
+    return $self;
+}
+
+sub provider_name {
+    my ($self) = @_;
+
+    return "Borg";
+}
+
+sub job_hook {
+    my ($self, $phase, $info) = @_;
+
+    if ($phase eq 'start') {
+	$self->{'job-id'} = $info->{'start-time'};
+	$self->{password} = $self->{'storage-plugin'}->borg_get_password(
+	    $self->{scfg}, $self->{storeid});
+	$self->{'ssh-key-fh'} = $self->{'storage-plugin'}->borg_open_ssh_key(
+	    $self->{scfg}, $self->{storeid});
+    } else {
+	delete $self->{password};
+    }
+
+    return;
+}
+
+sub backup_hook {
+    my ($self, $phase, $vmid, $vmtype, $info) = @_;
+
+    if ($phase eq 'start') {
+	$self->{$vmid}->{'task-size'} = 0;
+    } elsif ($phase eq 'prepare') {
+	if ($vmtype eq 'lxc') {
+	    my $archive = $self->{$vmid}->{archive};
+	    my $run_dir = prepare_run_dir($archive, "backup-container");
+	    $self->{$vmid}->{'run-dir'} = $run_dir;
+
+	    my $create_dir = sub {
+		my $dir = shift;
+		make_path($dir);
+		die "unable to create directory $dir\n" if !-d $dir;
+		chown($info->{'backup-user-id'}, -1, $dir)
+		    or die "unable to change owner for $dir\n";
+	    };
+
+	    $create_dir->("${run_dir}/backup/");
+	    $create_dir->("${run_dir}/backup/filesystem");
+	    $create_dir->("${run_dir}/ssh");
+	    $create_dir->("${run_dir}/.config");
+	    $create_dir->("${run_dir}/.cache");
+
+	    for my $subdir ($info->{sources}->@*) {
+		PVE::Tools::run_command([
+		    'mount',
+		    '-o', 'bind,ro',
+		    "$info->{directory}/${subdir}",
+		    "${run_dir}/backup/filesystem/${subdir}",
+		]);
+	    }
+	}
+    } elsif ($phase eq 'end' || $phase eq 'abort') {
+	if ($vmtype eq 'lxc') {
+	    my $run_dir = $self->{$vmid}->{'run-dir'};
+	    eval {
+		eval { PVE::Tools::run_command(['umount', "${run_dir}/ssh"]); };
+		eval { PVE::Tools::run_command(['umount', '-R', "${run_dir}/backup/filesystem"]); };
+		_remove_tree($run_dir);
+	    };
+	    die "unable to clean up $run_dir - $@" if $@;
+	}
+    }
+
+    return;
+}
+
+sub backup_get_mechanism {
+    my ($self, $vmid, $vmtype) = @_;
+
+    return ('block-device', undef) if $vmtype eq 'qemu';
+    return ('directory', undef) if $vmtype eq 'lxc';
+
+    die "unsupported VM type '$vmtype'\n";
+}
+
+sub backup_get_archive_name {
+    my ($self, $vmid, $vmtype, $backup_time) = @_;
+
+    return $self->{$vmid}->{archive} = archive_name($vmtype, $vmid, $backup_time);
+}
+
+sub backup_get_task_size {
+    my ($self, $vmid) = @_;
+
+    return $self->{$vmid}->{'task-size'};
+}
+
+sub backup_handle_log_file {
+    my ($self, $vmid, $filename) = @_;
+
+    return; # don't upload, Proxmox VE keeps the task log too
+}
+
+sub backup_vm {
+    my ($self, $vmid, $guest_config, $volumes, $info) = @_;
+
+    # TODO honor bandwidth limit
+    # TODO discard?
+
+    my $archive = $self->{$vmid}->{archive};
+
+    my $run_dir = prepare_run_dir($archive, "backup-vm");
+    my $volume_dir = "${run_dir}/volumes";
+    make_path($volume_dir);
+    die "unable to create directory $volume_dir\n" if !-d $volume_dir;
+
+    PVE::Tools::file_set_contents("${run_dir}/guest.config", $guest_config);
+    my $paths = ['./guest.config'];
+
+    if (my $firewall_config = $info->{'firewall-config'}) {
+	PVE::Tools::file_set_contents("${run_dir}/firewall.config", $firewall_config);
+	push $paths->@*, './firewall.config';
+    }
+
+    for my $devicename (sort keys $volumes->%*) {
+	my $path = $volumes->{$devicename}->{path};
+	my $link_name = "${volume_dir}/${devicename}.raw";
+	symlink($path, $link_name) or die "could not create symlink $link_name -> $path\n";
+	push $paths->@*, "./volumes/" . basename($link_name, ());
+    }
+
+    # TODO --stats for size?
+
+    eval {
+	local $CWD = $run_dir;
+
+	$self->{'storage-plugin'}->borg_cmd_create(
+	    $self->{scfg},
+	    $self->{storeid},
+	    $self->{$vmid}->{archive},
+	    $paths,
+	    ['--read-special', '--progress'],
+	);
+    };
+    my $err = $@;
+    eval { _remove_tree($run_dir) };
+    log_warning($self, $@) if $@;
+    die $err if $err;
+}
+
+sub backup_container {
+    my ($self, $vmid, $guest_config, $exclude_patterns, $info) = @_;
+
+    # TODO honor bandwidth limit
+
+    my $run_dir = $self->{$vmid}->{'run-dir'};
+    my $backup_dir = "${run_dir}/backup";
+
+    my $archive = $self->{$vmid}->{archive};
+
+    PVE::Tools::run_command(['mount', '-t', 'tmpfs', '-o', 'size=1M', 'tmpfs', "${run_dir}/ssh"]);
+
+    if ($self->{'ssh-key-fh'}) {
+	my $ssh_key =
+	    PVE::Tools::safe_read_from($self->{'ssh-key-fh'}, 1024 * 1024, 0, "SSH key file");
+	PVE::Tools::file_set_contents("${run_dir}/ssh/ssh.key", $ssh_key, 0600);
+    }
+
+    if (my $ssh_fingerprint = $self->{scfg}->{'ssh-fingerprint'}) {
+	my ($server, $port) = $self->{scfg}->@{qw(server port)};
+	$server = "[$server]" if Net::IP::ip_is_ipv6($server);
+	$server = "${server}:${port}" if $port;
+	my $fp_line = "$server $ssh_fingerprint\n";
+	PVE::Tools::file_set_contents("${run_dir}/ssh/known_hosts", $fp_line, 0600);
+    }
+
+    PVE::Tools::file_set_contents("${backup_dir}/guest.config", $guest_config);
+    my $paths = ['./guest.config'];
+
+    if (my $firewall_config = $info->{'firewall-config'}) {
+	PVE::Tools::file_set_contents("${backup_dir}/firewall.config", $firewall_config);
+	push $paths->@*, './firewall.config';
+    }
+
+    push $paths->@*, "./filesystem";
+
+    my $opts = ['--numeric-ids', '--sparse', '--progress'];
+
+    for my $pattern ($exclude_patterns->@*) {
+	if ($pattern =~ m|^/|) {
+	    push $opts->@*, '-e', "filesystem${pattern}";
+	} else {
+	    push $opts->@*, '-e', "filesystem/**${pattern}";
+	}
+    }
+
+    push $opts->@*, '-e', "filesystem/**lost+found" if $info->{'backup-user-id'} != 0;
+
+    # TODO --stats for size?
+
+    # Don't make the directory change 'local' here, to avoid a permission denied error when
+    # changing back, because the method is executed in a user namespace.
+    $CWD = $backup_dir if $info->{'backup-user-id'} != 0;
+    {
+	local $CWD = $backup_dir;
+	local $ENV{BORG_BASE_DIR} = ${run_dir};
+	local $ENV{BORG_PASSPHRASE} = $self->{password};
+
+	local $ENV{BORG_RSH} =
+	    "ssh -o \"UserKnownHostsFile ${run_dir}/ssh/known_hosts\" -i ${run_dir}/ssh/ssh.key";
+
+	$self->{'storage-plugin'}->borg_cmd_create(
+	    $self->{scfg},
+	    $self->{storeid},
+	    $self->{$vmid}->{archive},
+	    $paths,
+	    $opts,
+	);
+    }
+}
+
+sub restore_get_mechanism {
+    my ($self, $volname, $storeid) = @_;
+
+    my (undef, $archive) = $self->{'storage-plugin'}->parse_volname($volname);
+    my ($vmtype) = $archive =~ m!^pve-([^\s-]+)!
+	or die "cannot parse guest type from archive name '$archive'\n";
+
+    return ('qemu-img', $vmtype) if $vmtype eq 'qemu';
+    return ('directory', $vmtype) if $vmtype eq 'lxc';
+
+    die "unexpected guest type '$vmtype'\n";
+}
+
+sub restore_get_guest_config {
+    my ($self, $volname, $storeid) = @_;
+
+    my (undef, $archive) = $self->{'storage-plugin'}->parse_volname($volname);
+    return file_contents_from_archive($self, $archive, 'guest.config');
+}
+
+sub restore_get_firewall_config {
+    my ($self, $volname, $storeid) = @_;
+
+    my (undef, $archive) = $self->{'storage-plugin'}->parse_volname($volname);
+    my $config = eval {
+	file_contents_from_archive($self, $archive, 'firewall.config');
+    };
+    if (my $err = $@) {
+	return if $err =~ m!Include pattern 'firewall\.config' never matched\.!;
+	die $err;
+    }
+    return $config;
+}
+
+sub restore_vm_init {
+    my ($self, $volname, $storeid) = @_;
+
+    my $res = {};
+
+    my (undef, $archive, $vmid) = $self->{'storage-plugin'}->parse_volname($volname);
+    my $mount_point = prepare_run_dir($archive, "restore-vm");
+
+    $self->{'storage-plugin'}->borg_cmd_mount(
+	$self->{scfg},
+	$self->{storeid},
+	$archive,
+	$mount_point,
+    );
+
+    my @backup_files = glob("$mount_point/volumes/*");
+    for my $backup_file (@backup_files) {
+	next if $backup_file !~ m!^(.*/(.*)\.raw)$!; # untaint
+	($backup_file, my $devicename) = ($1, $2);
+	# TODO avoid dependency on base plugin?
+	$res->{$devicename}->{size} = PVE::Storage::Plugin::file_size_info($backup_file);
+    }
+
+    $self->{$volname}->{'mount-point'} = $mount_point;
+
+    return $res;
+}
+
+sub restore_vm_cleanup {
+    my ($self, $volname, $storeid) = @_;
+
+    my $mount_point = $self->{$volname}->{'mount-point'} or return;
+
+    PVE::Tools::run_command(['umount', $mount_point]);
+
+    return;
+}
+
+sub restore_vm_volume_init {
+    my ($self, $volname, $storeid, $devicename, $info) = @_;
+
+    my $mount_point = $self->{$volname}->{'mount-point'}
+	or die "expected mount point for archive not present\n";
+
+    return { 'qemu-img-path' => "${mount_point}/volumes/${devicename}.raw" };
+}
+
+sub restore_vm_volume_cleanup {
+    my ($self, $volname, $storeid, $devicename, $info) = @_;
+
+    return;
+}
+
+sub restore_container_init {
+    my ($self, $volname, $storeid, $info) = @_;
+
+    my (undef, $archive, $vmid) = $self->{'storage-plugin'}->parse_volname($volname);
+    my $mount_point = prepare_run_dir($archive, "restore-container");
+
+    $self->{'storage-plugin'}->borg_cmd_mount(
+	$self->{scfg},
+	$self->{storeid},
+	$archive,
+	$mount_point,
+    );
+
+    $self->{$volname}->{'mount-point'} = $mount_point;
+
+    return { 'archive-directory' => "${mount_point}/filesystem" };
+}
+
+sub restore_container_cleanup {
+    my ($self, $volname, $storeid, $info) = @_;
+
+    my $mount_point = $self->{$volname}->{'mount-point'} or return;
+
+    PVE::Tools::run_command(['umount', $mount_point]);
+
+    return;
+}
+
+1;
diff --git a/src/PVE/BackupProvider/Plugin/Makefile b/src/PVE/BackupProvider/Plugin/Makefile
index bedc26e..db08c2d 100644
--- a/src/PVE/BackupProvider/Plugin/Makefile
+++ b/src/PVE/BackupProvider/Plugin/Makefile
@@ -1,4 +1,4 @@
-SOURCES = Base.pm DirectoryExample.pm
+SOURCES = Base.pm Borg.pm DirectoryExample.pm
 
 .PHONY: install
 install:
diff --git a/src/PVE/Storage.pm b/src/PVE/Storage.pm
index 9f9a86b..f4bfc55 100755
--- a/src/PVE/Storage.pm
+++ b/src/PVE/Storage.pm
@@ -40,6 +40,7 @@ use PVE::Storage::ZFSPlugin;
 use PVE::Storage::PBSPlugin;
 use PVE::Storage::BTRFSPlugin;
 use PVE::Storage::ESXiPlugin;
+use PVE::Storage::BorgBackupPlugin;
 
 # Storage API version. Increment it on changes in storage API interface.
 use constant APIVER => 11;
@@ -66,6 +67,7 @@ PVE::Storage::ZFSPlugin->register();
 PVE::Storage::PBSPlugin->register();
 PVE::Storage::BTRFSPlugin->register();
 PVE::Storage::ESXiPlugin->register();
+PVE::Storage::BorgBackupPlugin->register();
 
 # load third-party plugins
 if ( -d '/usr/share/perl5/PVE/Storage/Custom' ) {
diff --git a/src/PVE/Storage/BorgBackupPlugin.pm b/src/PVE/Storage/BorgBackupPlugin.pm
new file mode 100644
index 0000000..8f0e721
--- /dev/null
+++ b/src/PVE/Storage/BorgBackupPlugin.pm
@@ -0,0 +1,595 @@
+package PVE::Storage::BorgBackupPlugin;
+
+use strict;
+use warnings;
+
+use Fcntl qw(F_GETFD F_SETFD FD_CLOEXEC);
+use JSON qw(from_json);
+use MIME::Base64 qw(decode_base64);
+use Net::IP;
+use POSIX;
+
+use PVE::BackupProvider::Plugin::Borg;
+use PVE::Tools;
+
+use base qw(PVE::Storage::Plugin);
+
+my sub borg_repository_uri {
+    my ($scfg, $storeid) = @_;
+
+    my $uri = '';
+    my $server = $scfg->{server} or die "no server configured for $storeid\n";
+    my $username = $scfg->{username} or die "no username configured for $storeid\n";
+    my $prefix = "ssh://$username@";
+    $server = "[$server]" if Net::IP::ip_is_ipv6($server);
+    if (my $port = $scfg->{port}) {
+	$uri = "${prefix}${server}:${port}";
+    } else {
+	$uri = "${prefix}${server}";
+    }
+    $uri .= $scfg->{'repository-path'};
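+    # e.g. "ssh://backup@192.0.2.10:2222/srv/borg/pve-repo" (illustrative values)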
+
+    return $uri;
+}
+
+my sub borg_password_file_name {
+    my ($scfg, $storeid) = @_;
+
+    return "/etc/pve/priv/storage/${storeid}.pw";
+}
+
+my sub borg_set_password {
+    my ($scfg, $storeid, $password) = @_;
+
+    my $pwfile = borg_password_file_name($scfg, $storeid);
+    mkdir "/etc/pve/priv/storage";
+
+    PVE::Tools::file_set_contents($pwfile, "$password\n");
+}
+
+my sub borg_delete_password {
+    my ($scfg, $storeid) = @_;
+
+    my $pwfile = borg_password_file_name($scfg, $storeid);
+
+    unlink $pwfile;
+}
+
+sub borg_get_password {
+    my ($class, $scfg, $storeid) = @_;
+
+    my $pwfile = borg_password_file_name($scfg, $storeid);
+
+    return PVE::Tools::file_read_firstline($pwfile);
+}
+
+sub borg_cmd_list {
+    my ($class, $scfg, $storeid) = @_;
+
+    my $uri = borg_repository_uri($scfg, $storeid);
+
+    local $ENV{BORG_PASSPHRASE} = $class->borg_get_password($scfg, $storeid)
+	if !$ENV{BORG_PASSPHRASE};
+
+    my $json = '';
+    my $cmd = ['borg', 'list', '--json', $uri];
+
+    my $errfunc = sub { warn $_[0]; };
+    my $outfunc = sub { $json .= $_[0]; };
+
+    PVE::Tools::run_command(
+	$cmd, errmsg => "command @$cmd failed", outfunc => $outfunc, errfunc => $errfunc);
+
+    my $res = eval { from_json($json) };
+    die "unable to parse 'borg list' output - $@\n" if $@;
+    return $res;
+}
+
+sub borg_cmd_create {
+    my ($class, $scfg, $storeid, $archive, $paths, $opts) = @_;
+
+    my $uri = borg_repository_uri($scfg, $storeid);
+
+    local $ENV{BORG_PASSPHRASE} = $class->borg_get_password($scfg, $storeid)
+	if !$ENV{BORG_PASSPHRASE};
+
+    my $cmd = ['borg', 'create', $opts->@*, "${uri}::${archive}", $paths->@*];
+
+    PVE::Tools::run_command($cmd, errmsg => "command @$cmd failed");
+
+    return;
+}
+
+sub borg_cmd_extract {
+    my ($class, $scfg, $storeid, $archive, $paths) = @_;
+
+    my $uri = borg_repository_uri($scfg, $storeid);
+
+    local $ENV{BORG_PASSPHRASE} = $class->borg_get_password($scfg, $storeid)
+	if !$ENV{BORG_PASSPHRASE};
+
+    my $cmd = ['borg', 'extract', "${uri}::${archive}", $paths->@*];
+
+    PVE::Tools::run_command($cmd, errmsg => "command @$cmd failed");
+
+    return;
+}
+
+sub borg_cmd_delete {
+    my ($class, $scfg, $storeid, $archive) = @_;
+
+    my $uri = borg_repository_uri($scfg, $storeid);
+
+    local $ENV{BORG_PASSPHRASE} = $class->borg_get_password($scfg, $storeid)
+	if !$ENV{BORG_PASSPHRASE};
+
+    my $cmd = ['borg', 'delete', "${uri}::${archive}"];
+
+    PVE::Tools::run_command($cmd, errmsg => "command @$cmd failed");
+
+    return;
+}
+
+sub borg_cmd_info {
+    my ($class, $scfg, $storeid, $archive, $timeout) = @_;
+
+    my $uri = borg_repository_uri($scfg, $storeid);
+
+    local $ENV{BORG_PASSPHRASE} = $class->borg_get_password($scfg, $storeid)
+	if !$ENV{BORG_PASSPHRASE};
+
+    my $json = '';
+    my $cmd = ['borg', 'info', '--json', "${uri}::${archive}"];
+
+    my $errfunc = sub { warn $_[0]; };
+    my $outfunc = sub { $json .= $_[0]; };
+
+    PVE::Tools::run_command(
+	$cmd,
+	errmsg => "command @$cmd failed",
+	timeout => $timeout,
+	outfunc => $outfunc,
+	errfunc => $errfunc,
+    );
+
+    my $res = eval { from_json($json) };
+    die "unable to parse 'borg info' output for archive '$archive' - $@\n" if $@;
+    return $res;
+}
+
+sub borg_cmd_mount {
+    my ($class, $scfg, $storeid, $archive, $mount_point) = @_;
+
+    my $uri = borg_repository_uri($scfg, $storeid);
+
+    local $ENV{BORG_PASSPHRASE} = $class->borg_get_password($scfg, $storeid)
+	if !$ENV{BORG_PASSPHRASE};
+
+    my $cmd = ['borg', 'mount', "${uri}::${archive}", $mount_point];
+
+    PVE::Tools::run_command($cmd, errmsg => "command @$cmd failed");
+
+    return;
+}
+
+my sub parse_backup_time {
+    my ($time_string) = @_;
+
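+    # Expects the strftime("%FT%TZ") format used by archive_name(), e.g.
+    # "2024-11-07T16:51:00Z".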
+    my @tm = (POSIX::strptime($time_string, "%FT%TZ"));
+    # expect sec, min, hour, mday, mon, year
+    if (grep { !defined($_) } @tm[0..5]) {
+	warn "error parsing time from string '$time_string'\n";
+	return 0;
+    } else {
+	local $ENV{TZ} = 'UTC'; # time string is UTC
+
+	# Fill in isdst to avoid undef warning. No daylight saving time for UTC.
+	$tm[8] //= 0;
+
+	if (my $since_epoch = mktime(@tm)) {
+	    return int($since_epoch);
+	} else {
+	    warn "error parsing time from string '$time_string'\n";
+	    return 0;
+	}
+    }
+}
+
+# Helpers
+
+sub type {
+    return 'borg';
+}
+
+sub plugindata {
+    return {
+	content => [ { backup => 1, none => 1 }, { backup => 1 } ],
+	features => { 'backup-provider' => 1 },
+    };
+}
+
+sub properties {
+    return {
+	'repository-path' => {
+	    description => "Path to the backup repository",
+	    type => 'string',
+	},
+	'ssh-key' => {
+	    description => "FIXME", # FIXME
+	    type => 'string',
+	},
+	'ssh-fingerprint' => {
+	    description => "FIXME", # FIXME
+	    type => 'string',
+	},
+    };
+}
+
+sub options {
+    return {
+	'repository-path' => { fixed => 1 },
+	server => { fixed => 1 },
+	port => { optional => 1 },
+	username => { fixed => 1 },
+	'ssh-key' => { optional => 1 },
+	'ssh-fingerprint' => { optional => 1 },
+	password => { optional => 1 },
+	disable => { optional => 1 },
+	nodes => { optional => 1 },
+	'prune-backups' => { optional => 1 },
+	'max-protected-backups' => { optional => 1 },
+    };
+}
+
+sub borg_ssh_key_file_name {
+    my ($scfg, $storeid) = @_;
+
+    return "/etc/pve/priv/storage/${storeid}.ssh.key";
+}
+
+sub borg_set_ssh_key {
+    my ($scfg, $storeid, $key) = @_;
+
+    my $pwfile = borg_ssh_key_file_name($scfg, $storeid);
+    mkdir "/etc/pve/priv/storage";
+
+    PVE::Tools::file_set_contents($pwfile, "$key\n");
+}
+
+sub borg_delete_ssh_key {
+    my ($scfg, $storeid) = @_;
+
+    my $pwfile = borg_ssh_key_file_name($scfg, $storeid);
+
+    if (!unlink $pwfile) {
+	return if $! == ENOENT;
+	die "failed to delete SSH key! $!\n";
+    }
+    delete $scfg->{'ssh-key'};
+}
+
+sub borg_get_ssh_key {
+    my ($scfg, $storeid) = @_;
+
+    my $pwfile = borg_ssh_key_file_name($scfg, $storeid);
+
+    return PVE::Tools::file_get_contents($pwfile);
+}
+
+# Returns a file handle with FD_CLOEXEC disabled if there is an SSH key, or `undef` if there is
+# none. Dies on error.
+sub borg_open_ssh_key {
+    my ($self, $scfg, $storeid) = @_;
+
+    my $ssh_key_file = borg_ssh_key_file_name($scfg, $storeid);
+
+    my $keyfd;
+    if (!open($keyfd, '<', $ssh_key_file)) {
+	if ($! == ENOENT) {
+	    die "SSH key configured but no key file found!\n" if $scfg->{'ssh-key'};
+	    return undef;
+	}
+	die "failed to open SSH key: $ssh_key_file: $!\n";
+    }
+    my $flags = fcntl($keyfd, F_GETFD, 0)
+	// die "failed to get file descriptor flags for SSH key FD: $!\n";
+    fcntl($keyfd, F_SETFD, $flags & ~FD_CLOEXEC)
+	or die "failed to remove FD_CLOEXEC from SSH key file descriptor\n";
+
+    return $keyfd;
+}
+
+# Storage implementation
+
+sub on_add_hook {
+    my ($class, $storeid, $scfg, %param) = @_;
+
+    if (defined(my $password = $param{password})) {
+	borg_set_password($scfg, $storeid, $password);
+    } else {
+	borg_delete_password($scfg, $storeid);
+    }
+
+    if (defined(my $ssh_key = delete $param{'ssh-key'})) {
+	my $decoded = decode_base64($ssh_key);
+	borg_set_ssh_key($scfg, $storeid, $decoded);
+	$scfg->{'ssh-key'} = 1;
+    } else {
+	borg_delete_ssh_key($scfg, $storeid);
+    }
+
+    return;
+}
+
+sub on_update_hook {
+    my ($class, $storeid, $scfg, %param) = @_;
+
+    if (exists($param{password})) {
+	if (defined($param{password})) {
+	    borg_set_password($scfg, $storeid, $param{password});
+	} else {
+	    borg_delete_password($scfg, $storeid);
+	}
+    }
+
+    if (exists($param{'ssh-key'})) {
+	if (defined(my $ssh_key = delete($param{'ssh-key'}))) {
+	    my $decoded = decode_base64($ssh_key);
+
+	    borg_set_ssh_key($scfg, $storeid, $decoded);
+	    $scfg->{'ssh-key'} = 1;
+	} else {
+	    borg_delete_ssh_key($scfg, $storeid);
+	}
+    }
+
+    return;
+}
+
+sub on_delete_hook {
+    my ($class, $storeid, $scfg) = @_;
+
+    borg_delete_password($scfg, $storeid);
+    borg_delete_ssh_key($scfg, $storeid);
+
+    return;
+}
+
+sub prune_backups {
+    my ($class, $scfg, $storeid, $keep, $vmid, $type, $dryrun, $logfunc) = @_;
+
+    # FIXME - is 'borg prune' compatible with ours?
+    die "not implemented";
+}
+
+sub parse_volname {
+    my ($class, $volname) = @_;
+
+    if ($volname =~ m!^backup/(.*)$!) {
+	my $archive = $1;
+	if ($archive =~ $PVE::BackupProvider::Plugin::Borg::ARCHIVE_RE_3) {
+	    return ('backup', $archive, $2);
+	}
+    }
+
+    die "unable to parse Borg volume name '$volname'\n";
+}
+
+sub path {
+    my ($class, $scfg, $volname, $storeid, $snapname) = @_;
+
+    die "volume snapshot is not possible on Borg volume" if $snapname;
+
+    my $uri = borg_repository_uri($scfg, $storeid);
+    my (undef, $archive) = $class->parse_volname($volname);
+
+    return "${uri}::${archive}";
+}
+
+sub create_base {
+    my ($class, $storeid, $scfg, $volname) = @_;
+
+    die "cannot create base image in Borg storage\n";
+}
+
+sub clone_image {
+    my ($class, $scfg, $storeid, $volname, $vmid, $snap) = @_;
+
+    die "can't clone images in Borg storage\n";
+}
+
+sub alloc_image {
+    my ($class, $storeid, $scfg, $vmid, $fmt, $name, $size) = @_;
+
+    die "can't allocate space in Borg storage\n";
+}
+
+sub free_image {
+    my ($class, $storeid, $scfg, $volname, $isBase) = @_;
+
+    my (undef, $archive) = $class->parse_volname($volname);
+
+    borg_cmd_delete($class, $scfg, $storeid, $archive);
+
+    return;
+}
+
+sub list_images {
+    my ($class, $storeid, $scfg, $vmid, $vollist, $cache) = @_;
+
+    return []; # guest images are not supported, only backups
+}
+
+sub list_volumes {
+    my ($class, $storeid, $scfg, $vmid, $content_types) = @_;
+
+    my $res = [];
+
+    return $res if !grep { $_ eq 'backup' } $content_types->@*;
+
+    my $archives = $class->borg_cmd_list($scfg, $storeid)->{archives}
+	or die "expected 'archives' key in 'borg list' JSON output missing\n";
+
+    for my $info ($archives->@*) {
+	my $archive = $info->{archive};
+	my ($vmtype, $backup_vmid, $time_string) =
+	    $archive =~ $PVE::BackupProvider::Plugin::Borg::ARCHIVE_RE_3 or next;
+
+	next if defined($vmid) && $vmid != $backup_vmid;
+
+	push $res->@*, {
+	    volid => "${storeid}:backup/${archive}",
+	    size => 0, # FIXME how to cheaply get?
+	    content => 'backup',
+	    ctime => parse_backup_time($time_string),
+	    vmid => $backup_vmid,
+	    format => "borg-archive",
+	    subtype => $vmtype,
+	}
+    }
+
+    return $res;
+}
+
+sub status {
+    my ($class, $storeid, $scfg, $cache) = @_;
+
+    my $uri = borg_repository_uri($scfg, $storeid);
+
+    my $res;
+
+    if ($uri =~ m!^ssh://!) {
+	#FIXME ssh and df on target?
+	return;
+    } else { # $uri is a local path
+	my $timeout = 2;
+	$res = PVE::Tools::df($uri, $timeout);
+
+	return if !$res || !$res->{total};
+    }
+
+    return ($res->{total}, $res->{avail}, $res->{used}, 1);
+}
+
+sub activate_storage {
+    my ($class, $storeid, $scfg, $cache) = @_;
+
+    # TODO how to cheaply check? split ssh and non-ssh?
+
+    return 1;
+}
+
+sub deactivate_storage {
+    my ($class, $storeid, $scfg, $cache) = @_;
+
+    return 1;
+}
+
+sub activate_volume {
+    my ($class, $storeid, $scfg, $volname, $snapname, $cache) = @_;
+
+    die "volume snapshot is not possible on Borg volume" if $snapname;
+
+    return 1;
+}
+
+sub deactivate_volume {
+    my ($class, $storeid, $scfg, $volname, $snapname, $cache) = @_;
+
+    die "volume snapshot is not possible on Borg volume" if $snapname;
+
+    return 1;
+}
+
+sub get_volume_attribute {
+    my ($class, $scfg, $storeid, $volname, $attribute) = @_;
+
+    return;
+}
+
+sub update_volume_attribute {
+    my ($class, $scfg, $storeid, $volname, $attribute, $value) = @_;
+
+    # FIXME notes or protected possible?
+
+    die "attribute '$attribute' is not supported on Borg volume";
+}
+
+sub volume_size_info {
+    my ($class, $scfg, $storeid, $volname, $timeout) = @_;
+
+    my (undef, $archive) = $class->parse_volname($volname);
+    my (undef, undef, $time_string) =
+	$archive =~ $PVE::BackupProvider::Plugin::Borg::ARCHIVE_RE_3;
+
+    my $backup_time = 0;
+    if ($time_string) {
+	$backup_time = parse_backup_time($time_string)
+    } else {
+	warn "could not parse time from archive name '$archive'\n";
+    }
+
+    my $archives = borg_cmd_info($class, $scfg, $storeid, $archive, $timeout)->{archives}
+	or die "expected 'archives' key in 'borg info' JSON output missing\n";
+
+    my $stats = eval { $archives->[0]->{stats} }
+	or die "expected entry in 'borg info' JSON output missing\n";
+    my ($size, $used) = $stats->@{qw(original_size deduplicated_size)};
+
+    ($size) = ($size =~ /^(\d+)$/); # untaint
+    die "size '$size' not an integer\n" if !defined($size);
+    # coerce back from string
+    $size = int($size);
+    ($used) = ($used =~ /^(\d+)$/); # untaint
+    die "used '$used' not an integer\n" if !defined($used);
+    # coerce back from string
+    $used = int($used);
+
+    return wantarray ? ($size, 'borg-archive', $used, undef, $backup_time) : $size;
+}
+
+sub volume_resize {
+    my ($class, $scfg, $storeid, $volname, $size, $running) = @_;
+
+    die "volume resize is not possible on Borg volume";
+}
+
+sub volume_snapshot {
+    my ($class, $scfg, $storeid, $volname, $snap) = @_;
+
+    die "volume snapshot is not possible on Borg volume";
+}
+
+sub volume_snapshot_rollback {
+    my ($class, $scfg, $storeid, $volname, $snap) = @_;
+
+    die "volume snapshot rollback is not possible on Borg volume";
+}
+
+sub volume_snapshot_delete {
+    my ($class, $scfg, $storeid, $volname, $snap) = @_;
+
+    die "volume snapshot delete is not possible on Borg volume";
+}
+
+sub volume_has_feature {
+    my ($class, $scfg, $feature, $storeid, $volname, $snapname, $running) = @_;
+
+    return 0;
+}
+
+sub rename_volume {
+    my ($class, $scfg, $storeid, $source_volname, $target_vmid, $target_volname) = @_;
+
+    die "volume rename is not implemented in Borg storage plugin\n";
+}
+
+sub new_backup_provider {
+    my ($class, $scfg, $storeid, $bandwidth_limit, $log_function) = @_;
+
+    return PVE::BackupProvider::Plugin::Borg->new(
+	$class, $scfg, $storeid, $bandwidth_limit, $log_function);
+}
+
+1;
diff --git a/src/PVE/Storage/Makefile b/src/PVE/Storage/Makefile
index acd37f4..9fe2c66 100644
--- a/src/PVE/Storage/Makefile
+++ b/src/PVE/Storage/Makefile
@@ -14,6 +14,7 @@ SOURCES= \
 	PBSPlugin.pm \
 	BTRFSPlugin.pm \
 	LvmThinPlugin.pm \
+	BorgBackupPlugin.pm \
 	ESXiPlugin.pm
 
 .PHONY: install
-- 
2.39.5




* [pve-devel] [PATCH qemu-server v3 16/34] move nbd_stop helper to QMPHelpers module
  2024-11-07 16:51 [pve-devel] [RFC qemu/common/storage/qemu-server/container/manager v3 00/34] backup provider API Fiona Ebner
                   ` (14 preceding siblings ...)
  2024-11-07 16:51 ` [pve-devel] [POC storage v3 15/34] WIP Borg plugin Fiona Ebner
@ 2024-11-07 16:51 ` Fiona Ebner
  2024-11-11 13:55   ` [pve-devel] applied: " Fabian Grünbichler
  2024-11-07 16:51 ` [pve-devel] [PATCH qemu-server v3 17/34] backup: move cleanup of fleecing images to cleanup method Fiona Ebner
                   ` (18 subsequent siblings)
  34 siblings, 1 reply; 63+ messages in thread
From: Fiona Ebner @ 2024-11-07 16:51 UTC (permalink / raw)
  To: pve-devel

This way, nbd_stop() can be called from a module that cannot include
QemuServer.pm.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---

No changes in v3.

 PVE/API2/Qemu.pm             | 3 ++-
 PVE/CLI/qm.pm                | 3 ++-
 PVE/QemuServer.pm            | 6 ------
 PVE/QemuServer/QMPHelpers.pm | 6 ++++++
 4 files changed, 10 insertions(+), 8 deletions(-)

diff --git a/PVE/API2/Qemu.pm b/PVE/API2/Qemu.pm
index 848001b6..1c3cb271 100644
--- a/PVE/API2/Qemu.pm
+++ b/PVE/API2/Qemu.pm
@@ -35,6 +35,7 @@ use PVE::QemuServer::Monitor qw(mon_cmd);
 use PVE::QemuServer::Machine;
 use PVE::QemuServer::Memory qw(get_current_memory);
 use PVE::QemuServer::PCI;
+use PVE::QemuServer::QMPHelpers;
 use PVE::QemuServer::USB;
 use PVE::QemuMigrate;
 use PVE::RPCEnvironment;
@@ -5910,7 +5911,7 @@ __PACKAGE__->register_method({
 		    return;
 		},
 		'nbdstop' => sub {
-		    PVE::QemuServer::nbd_stop($state->{vmid});
+		    PVE::QemuServer::QMPHelpers::nbd_stop($state->{vmid});
 		    return;
 		},
 		'resume' => sub {
diff --git a/PVE/CLI/qm.pm b/PVE/CLI/qm.pm
index 8d8ce10a..47b87782 100755
--- a/PVE/CLI/qm.pm
+++ b/PVE/CLI/qm.pm
@@ -35,6 +35,7 @@ use PVE::QemuServer::Agent qw(agent_available);
 use PVE::QemuServer::ImportDisk;
 use PVE::QemuServer::Monitor qw(mon_cmd);
 use PVE::QemuServer::OVF;
+use PVE::QemuServer::QMPHelpers;
 use PVE::QemuServer;
 
 use PVE::CLIHandler;
@@ -385,7 +386,7 @@ __PACKAGE__->register_method ({
 
 	my $vmid = $param->{vmid};
 
-	eval { PVE::QemuServer::nbd_stop($vmid) };
+	eval { PVE::QemuServer::QMPHelpers::nbd_stop($vmid) };
 	warn $@ if $@;
 
 	return;
diff --git a/PVE/QemuServer.pm b/PVE/QemuServer.pm
index 0df3bda0..49b6ca17 100644
--- a/PVE/QemuServer.pm
+++ b/PVE/QemuServer.pm
@@ -8606,12 +8606,6 @@ sub generate_smbios1_uuid {
     return "uuid=".generate_uuid();
 }
 
-sub nbd_stop {
-    my ($vmid) = @_;
-
-    mon_cmd($vmid, 'nbd-server-stop', timeout => 25);
-}
-
 sub create_reboot_request {
     my ($vmid) = @_;
     open(my $fh, '>', "/run/qemu-server/$vmid.reboot")
diff --git a/PVE/QemuServer/QMPHelpers.pm b/PVE/QemuServer/QMPHelpers.pm
index 0269ea46..826938de 100644
--- a/PVE/QemuServer/QMPHelpers.pm
+++ b/PVE/QemuServer/QMPHelpers.pm
@@ -15,6 +15,12 @@ qemu_objectadd
 qemu_objectdel
 );
 
+sub nbd_stop {
+    my ($vmid) = @_;
+
+    mon_cmd($vmid, 'nbd-server-stop', timeout => 25);
+}
+
 sub qemu_deviceadd {
     my ($vmid, $devicefull) = @_;
 
-- 
2.39.5




* [pve-devel] [PATCH qemu-server v3 17/34] backup: move cleanup of fleecing images to cleanup method
  2024-11-07 16:51 [pve-devel] [RFC qemu/common/storage/qemu-server/container/manager v3 00/34] backup provider API Fiona Ebner
                   ` (15 preceding siblings ...)
  2024-11-07 16:51 ` [pve-devel] [PATCH qemu-server v3 16/34] move nbd_stop helper to QMPHelpers module Fiona Ebner
@ 2024-11-07 16:51 ` Fiona Ebner
  2024-11-12  9:26   ` [pve-devel] applied: " Fabian Grünbichler
  2024-11-07 16:51 ` [pve-devel] [PATCH qemu-server v3 18/34] backup: cleanup: check if VM is running before issuing QMP commands Fiona Ebner
                   ` (17 subsequent siblings)
  34 siblings, 1 reply; 63+ messages in thread
From: Fiona Ebner @ 2024-11-07 16:51 UTC (permalink / raw)
  To: pve-devel

TPM drives are already detached there and it's better to group
these things together.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---

No changes in v3.

 PVE/VZDump/QemuServer.pm | 25 +++++++++----------------
 1 file changed, 9 insertions(+), 16 deletions(-)

diff --git a/PVE/VZDump/QemuServer.pm b/PVE/VZDump/QemuServer.pm
index 012c9210..b2ced154 100644
--- a/PVE/VZDump/QemuServer.pm
+++ b/PVE/VZDump/QemuServer.pm
@@ -690,7 +690,6 @@ sub archive_pbs {
 
     # get list early so we die on unknown drive types before doing anything
     my $devlist = _get_task_devlist($task);
-    my $use_fleecing;
 
     $self->enforce_vm_running_for_backup($vmid);
     $self->{qmeventd_fh} = PVE::QemuServer::register_qmeventd_handle($vmid);
@@ -721,7 +720,7 @@ sub archive_pbs {
 
 	my $is_template = PVE::QemuConfig->is_template($self->{vmlist}->{$vmid});
 
-	$use_fleecing = check_and_prepare_fleecing(
+	$task->{'use-fleecing'} = check_and_prepare_fleecing(
 	    $self, $vmid, $opts->{fleecing}, $task->{disks}, $is_template, $qemu_support);
 
 	my $fs_frozen = $self->qga_fs_freeze($task, $vmid);
@@ -735,7 +734,7 @@ sub archive_pbs {
 	    devlist => $devlist,
 	    'config-file' => $conffile,
 	};
-	$params->{fleecing} = JSON::true if $use_fleecing;
+	$params->{fleecing} = JSON::true if $task->{'use-fleecing'};
 
 	if (defined(my $ns = $scfg->{namespace})) {
 	    $params->{'backup-ns'} = $ns;
@@ -784,11 +783,6 @@ sub archive_pbs {
     }
     $self->restore_vm_power_state($vmid);
 
-    if ($use_fleecing) {
-	detach_fleecing_images($task->{disks}, $vmid);
-	cleanup_fleecing_images($self, $task->{disks});
-    }
-
     die $err if $err;
 }
 
@@ -891,7 +885,6 @@ sub archive_vma {
     }
 
     my $devlist = _get_task_devlist($task);
-    my $use_fleecing;
 
     $self->enforce_vm_running_for_backup($vmid);
     $self->{qmeventd_fh} = PVE::QemuServer::register_qmeventd_handle($vmid);
@@ -911,7 +904,7 @@ sub archive_vma {
 
 	$attach_tpmstate_drive->($self, $task, $vmid);
 
-	$use_fleecing = check_and_prepare_fleecing(
+	$task->{'use-fleecing'} = check_and_prepare_fleecing(
 	    $self, $vmid, $opts->{fleecing}, $task->{disks}, $is_template, $qemu_support);
 
 	my $outfh;
@@ -942,7 +935,7 @@ sub archive_vma {
 		devlist => $devlist
 	    };
 	    $params->{'firewall-file'} = $firewall if -e $firewall;
-	    $params->{fleecing} = JSON::true if $use_fleecing;
+	    $params->{fleecing} = JSON::true if $task->{'use-fleecing'};
 	    add_backup_performance_options($params, $opts->{performance}, $qemu_support);
 
 	    $qmpclient->queue_cmd($vmid, $backup_cb, 'backup', %$params);
@@ -984,11 +977,6 @@ sub archive_vma {
 
     $self->restore_vm_power_state($vmid);
 
-    if ($use_fleecing) {
-	detach_fleecing_images($task->{disks}, $vmid);
-	cleanup_fleecing_images($self, $task->{disks});
-    }
-
     if ($err) {
 	if ($cpid) {
 	    kill(9, $cpid);
@@ -1132,6 +1120,11 @@ sub cleanup {
 
     $detach_tpmstate_drive->($task, $vmid);
 
+    if ($task->{'use-fleecing'}) {
+	detach_fleecing_images($task->{disks}, $vmid);
+	cleanup_fleecing_images($self, $task->{disks});
+    }
+
     if ($self->{qmeventd_fh}) {
 	close($self->{qmeventd_fh});
     }
-- 
2.39.5




* [pve-devel] [PATCH qemu-server v3 18/34] backup: cleanup: check if VM is running before issuing QMP commands
  2024-11-07 16:51 [pve-devel] [RFC qemu/common/storage/qemu-server/container/manager v3 00/34] backup provider API Fiona Ebner
                   ` (16 preceding siblings ...)
  2024-11-07 16:51 ` [pve-devel] [PATCH qemu-server v3 17/34] backup: move cleanup of fleecing images to cleanup method Fiona Ebner
@ 2024-11-07 16:51 ` Fiona Ebner
  2024-11-12  9:26   ` [pve-devel] applied: " Fabian Grünbichler
  2024-11-07 16:51 ` [pve-devel] [PATCH qemu-server v3 19/34] backup: keep track of block-node size for fleecing Fiona Ebner
                   ` (16 subsequent siblings)
  34 siblings, 1 reply; 63+ messages in thread
From: Fiona Ebner @ 2024-11-07 16:51 UTC (permalink / raw)
  To: pve-devel

When the VM was only started for backup, it will already be stopped
again at this point. While the detach helpers do not warn about errors
currently, that might change in the future. This is also in
preparation for other cleanup QMP helpers that are more verbose about
failure.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---

No changes in v3.

 PVE/VZDump/QemuServer.pm | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/PVE/VZDump/QemuServer.pm b/PVE/VZDump/QemuServer.pm
index b2ced154..c46e607c 100644
--- a/PVE/VZDump/QemuServer.pm
+++ b/PVE/VZDump/QemuServer.pm
@@ -1118,13 +1118,14 @@ sub snapshot {
 sub cleanup {
     my ($self, $task, $vmid) = @_;
 
-    $detach_tpmstate_drive->($task, $vmid);
-
-    if ($task->{'use-fleecing'}) {
-	detach_fleecing_images($task->{disks}, $vmid);
-	cleanup_fleecing_images($self, $task->{disks});
+    # If VM was started only for backup, it is already stopped now.
+    if (PVE::QemuServer::Helpers::vm_running_locally($vmid)) {
+	$detach_tpmstate_drive->($task, $vmid);
+	detach_fleecing_images($task->{disks}, $vmid) if $task->{'use-fleecing'};
     }
 
+    cleanup_fleecing_images($self, $task->{disks}) if $task->{'use-fleecing'};
+
     if ($self->{qmeventd_fh}) {
 	close($self->{qmeventd_fh});
     }
-- 
2.39.5




* [pve-devel] [PATCH qemu-server v3 19/34] backup: keep track of block-node size for fleecing
  2024-11-07 16:51 [pve-devel] [RFC qemu/common/storage/qemu-server/container/manager v3 00/34] backup provider API Fiona Ebner
                   ` (17 preceding siblings ...)
  2024-11-07 16:51 ` [pve-devel] [PATCH qemu-server v3 18/34] backup: cleanup: check if VM is running before issuing QMP commands Fiona Ebner
@ 2024-11-07 16:51 ` Fiona Ebner
  2024-11-11 14:22   ` Fabian Grünbichler
  2024-11-07 16:51 ` [pve-devel] [RFC qemu-server v3 20/34] backup: allow adding fleecing images also for EFI and TPM Fiona Ebner
                   ` (15 subsequent siblings)
  34 siblings, 1 reply; 63+ messages in thread
From: Fiona Ebner @ 2024-11-07 16:51 UTC (permalink / raw)
  To: pve-devel

For fleecing, the size needs to match exactly what QEMU sees. In
particular, EFI disks might be attached with a 'size=' option, meaning
that size can be different from the volume's size. Commit 36377acf
("backup: disk info: also keep track of size") introduced size
tracking and it has been used for fleecing since then, but the accurate
size information needs to be queried via QMP.

Should also help with the following issue reported in the community
forum:
https://forum.proxmox.com/threads/152202
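
For illustration, a minimal sketch of how the accurate size can be
read from the 'query-block' reply (field names as in QEMU's QMP
schema; error handling omitted):

    my $block_info = mon_cmd($vmid, "query-block");
    for my $dev ($block_info->@*) {
        # The size QEMU itself sees for the block node, which can differ
        # from the underlying volume's size (e.g. EFI disks with 'size=').
        my $size = $dev->{inserted}->{image}->{'virtual-size'};
        print "$dev->{device}: $size bytes\n";
    }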

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---

Changes in v3:
* only use query-block QMP command after the VM is enforced running

 PVE/VZDump/QemuServer.pm | 37 ++++++++++++++++++++++++++++++++-----
 1 file changed, 32 insertions(+), 5 deletions(-)

diff --git a/PVE/VZDump/QemuServer.pm b/PVE/VZDump/QemuServer.pm
index c46e607c..1ebafe6d 100644
--- a/PVE/VZDump/QemuServer.pm
+++ b/PVE/VZDump/QemuServer.pm
@@ -551,7 +551,7 @@ my sub allocate_fleecing_images {
 		my $name = "vm-$vmid-fleece-$n";
 		$name .= ".$format" if $scfg->{path};
 
-		my $size = PVE::Tools::convert_size($di->{size}, 'b' => 'kb');
+		my $size = PVE::Tools::convert_size($di->{'block-node-size'}, 'b' => 'kb');
 
 		$di->{'fleece-volid'} = PVE::Storage::vdisk_alloc(
 		    $self->{storecfg}, $fleecing_storeid, $vmid, $format, $name, $size);
@@ -600,7 +600,7 @@ my sub attach_fleecing_images {
 	    my $drive = "file=$path,if=none,id=$devid,format=$format,discard=unmap";
 	    # Specify size explicitly, to make it work if storage backend rounded up size for
 	    # fleecing image when allocating.
-	    $drive .= ",size=$di->{size}" if $format eq 'raw';
+	    $drive .= ",size=$di->{'block-node-size'}" if $format eq 'raw';
 	    $drive =~ s/\\/\\\\/g;
 	    my $ret = PVE::QemuServer::Monitor::hmp_cmd($vmid, "drive_add auto \"$drive\"", 60);
 	    die "attaching fleecing image $volid failed - $ret\n" if $ret !~ m/OK/s;
@@ -609,7 +609,7 @@ my sub attach_fleecing_images {
 }
 
 my sub check_and_prepare_fleecing {
-    my ($self, $vmid, $fleecing_opts, $disks, $is_template, $qemu_support) = @_;
+    my ($self, $task, $vmid, $fleecing_opts, $disks, $is_template, $qemu_support) = @_;
 
     # Even if the VM was started specifically for fleecing, it's possible that the VM is resumed and
     # then starts doing IO. For VMs that are not resumed the fleecing images will just stay empty,
@@ -626,6 +626,8 @@ my sub check_and_prepare_fleecing {
     }
 
     if ($use_fleecing) {
+	$self->query_block_node_sizes($vmid, $task);
+
 	my ($default_format, $valid_formats) = PVE::Storage::storage_default_format(
 	    $self->{storecfg}, $fleecing_opts->{storage});
 	my $format = scalar(grep { $_ eq 'qcow2' } $valid_formats->@*) ? 'qcow2' : 'raw';
@@ -721,7 +723,7 @@ sub archive_pbs {
 	my $is_template = PVE::QemuConfig->is_template($self->{vmlist}->{$vmid});
 
 	$task->{'use-fleecing'} = check_and_prepare_fleecing(
-	    $self, $vmid, $opts->{fleecing}, $task->{disks}, $is_template, $qemu_support);
+	    $self, $task, $vmid, $opts->{fleecing}, $task->{disks}, $is_template, $qemu_support);
 
 	my $fs_frozen = $self->qga_fs_freeze($task, $vmid);
 
@@ -905,7 +907,7 @@ sub archive_vma {
 	$attach_tpmstate_drive->($self, $task, $vmid);
 
 	$task->{'use-fleecing'} = check_and_prepare_fleecing(
-	    $self, $vmid, $opts->{fleecing}, $task->{disks}, $is_template, $qemu_support);
+	    $self, $task, $vmid, $opts->{fleecing}, $task->{disks}, $is_template, $qemu_support);
 
 	my $outfh;
 	if ($opts->{stdout}) {
@@ -1042,6 +1044,31 @@ sub qga_fs_thaw {
     $self->logerr($@) if $@;
 }
 
+# The size for fleecing images needs to match exactly what QEMU sees. E.g. an EFI disk can be
+# attached with a smaller size than the underlying image on the storage.
+sub query_block_node_sizes {
+    my ($self, $vmid, $task) = @_;
+
+    my $block_info = mon_cmd($vmid, "query-block");
+    $block_info = { map { $_->{device} => $_ } $block_info->@* };
+
+    for my $diskinfo ($task->{disks}->@*) {
+	my $drive_key = $diskinfo->{virtdev};
+	$drive_key .= "-backup" if $drive_key eq 'tpmstate0';
+	my $block_node_size =
+	    eval { $block_info->{"drive-$drive_key"}->{inserted}->{image}->{'virtual-size'}; };
+	if (!$block_node_size) {
+	    $self->loginfo(
+		"could not determine block node size of drive '$drive_key' - using fallback");
+	    $block_node_size = $diskinfo->{size}
+		or die "could not determine size of drive '$drive_key'\n";
+	}
+	$diskinfo->{'block-node-size'} = $block_node_size;
+    }
+
+    return;
+}
+
 # we need a running QEMU/KVM process for backup, starts a paused (prelaunch)
 # one if VM isn't already running
 sub enforce_vm_running_for_backup {
-- 
2.39.5




* [pve-devel] [RFC qemu-server v3 20/34] backup: allow adding fleecing images also for EFI and TPM
  2024-11-07 16:51 [pve-devel] [RFC qemu/common/storage/qemu-server/container/manager v3 00/34] backup provider API Fiona Ebner
                   ` (18 preceding siblings ...)
  2024-11-07 16:51 ` [pve-devel] [PATCH qemu-server v3 19/34] backup: keep track of block-node size for fleecing Fiona Ebner
@ 2024-11-07 16:51 ` Fiona Ebner
  2024-11-12  9:26   ` Fabian Grünbichler
  2024-11-07 16:51 ` [pve-devel] [RFC qemu-server v3 21/34] backup: implement backup for external providers Fiona Ebner
                   ` (14 subsequent siblings)
  34 siblings, 1 reply; 63+ messages in thread
From: Fiona Ebner @ 2024-11-07 16:51 UTC (permalink / raw)
  To: pve-devel

For the external backup API, it will be necessary to add a fleecing
image even for small disks like EFI and TPM, because there is no other
place the old data could be copied to when a new guest write comes in.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---

Changes in v3:
* adapt to context changes from previous patch

 PVE/VZDump/QemuServer.pm | 14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/PVE/VZDump/QemuServer.pm b/PVE/VZDump/QemuServer.pm
index 1ebafe6d..b6dcd6cc 100644
--- a/PVE/VZDump/QemuServer.pm
+++ b/PVE/VZDump/QemuServer.pm
@@ -534,7 +534,7 @@ my sub cleanup_fleecing_images {
 }
 
 my sub allocate_fleecing_images {
-    my ($self, $disks, $vmid, $fleecing_storeid, $format) = @_;
+    my ($self, $disks, $vmid, $fleecing_storeid, $format, $all_images) = @_;
 
     die "internal error - no fleecing storage specified\n" if !$fleecing_storeid;
 
@@ -545,7 +545,8 @@ my sub allocate_fleecing_images {
 	my $n = 0; # counter for fleecing image names
 
 	for my $di ($disks->@*) {
-	    next if $di->{virtdev} =~ m/^(?:tpmstate|efidisk)\d$/; # too small to be worth it
+	    # EFI/TPM are usually too small to be worth it, but it's required for external providers
+	    next if !$all_images && $di->{virtdev} =~ m/^(?:tpmstate|efidisk)\d$/;
 	    if ($di->{type} eq 'block' || $di->{type} eq 'file') {
 		my $scfg = PVE::Storage::storage_config($self->{storecfg}, $fleecing_storeid);
 		my $name = "vm-$vmid-fleece-$n";
@@ -609,7 +610,7 @@ my sub attach_fleecing_images {
 }
 
 my sub check_and_prepare_fleecing {
-    my ($self, $task, $vmid, $fleecing_opts, $disks, $is_template, $qemu_support) = @_;
+    my ($self, $task, $vmid, $fleecing_opts, $disks, $is_template, $qemu_support, $all_images) = @_;
 
     # Even if the VM was started specifically for fleecing, it's possible that the VM is resumed and
     # then starts doing IO. For VMs that are not resumed the fleecing images will just stay empty,
@@ -632,7 +633,8 @@ my sub check_and_prepare_fleecing {
 	    $self->{storecfg}, $fleecing_opts->{storage});
 	my $format = scalar(grep { $_ eq 'qcow2' } $valid_formats->@*) ? 'qcow2' : 'raw';
 
-	allocate_fleecing_images($self, $disks, $vmid, $fleecing_opts->{storage}, $format);
+	allocate_fleecing_images(
+	    $self, $disks, $vmid, $fleecing_opts->{storage}, $format, $all_images);
 	attach_fleecing_images($self, $disks, $vmid, $format);
     }
 
@@ -723,7 +725,7 @@ sub archive_pbs {
 	my $is_template = PVE::QemuConfig->is_template($self->{vmlist}->{$vmid});
 
 	$task->{'use-fleecing'} = check_and_prepare_fleecing(
-	    $self, $task, $vmid, $opts->{fleecing}, $task->{disks}, $is_template, $qemu_support);
+	    $self, $task, $vmid, $opts->{fleecing}, $task->{disks}, $is_template, $qemu_support, 0);
 
 	my $fs_frozen = $self->qga_fs_freeze($task, $vmid);
 
@@ -907,7 +909,7 @@ sub archive_vma {
 	$attach_tpmstate_drive->($self, $task, $vmid);
 
 	$task->{'use-fleecing'} = check_and_prepare_fleecing(
-	    $self, $task, $vmid, $opts->{fleecing}, $task->{disks}, $is_template, $qemu_support);
+	    $self, $task, $vmid, $opts->{fleecing}, $task->{disks}, $is_template, $qemu_support, 0);
 
 	my $outfh;
 	if ($opts->{stdout}) {
-- 
2.39.5




* [pve-devel] [RFC qemu-server v3 21/34] backup: implement backup for external providers
  2024-11-07 16:51 [pve-devel] [RFC qemu/common/storage/qemu-server/container/manager v3 00/34] backup provider API Fiona Ebner
                   ` (19 preceding siblings ...)
  2024-11-07 16:51 ` [pve-devel] [RFC qemu-server v3 20/34] backup: allow adding fleecing images also for EFI and TPM Fiona Ebner
@ 2024-11-07 16:51 ` Fiona Ebner
  2024-11-12 12:27   ` Fabian Grünbichler
  2024-11-07 16:51 ` [pve-devel] [PATCH qemu-server v3 22/34] restore: die early when there is no size for a device Fiona Ebner
                   ` (13 subsequent siblings)
  34 siblings, 1 reply; 63+ messages in thread
From: Fiona Ebner @ 2024-11-07 16:51 UTC (permalink / raw)
  To: pve-devel

The state of the VM's disk images at the time the backup is started is
preserved via a snapshot-access block node. Old data is moved to the
fleecing image when new guest writes come in. The snapshot-access
block node, as well as the associated bitmap in case of incremental
backup, will be made available to the external provider. They are
exported via NBD: for the 'nbd' mechanism, the NBD socket path is
passed to the provider, while for the 'block-device' mechanism, the
NBD export is made accessible as a regular block device first and the
bitmap information is made available via a $next_dirty_region->()
function. For 'block-device', the 'nbdinfo' binary is required.

The provider can indicate that it wants to do an incremental backup by
returning the bitmap ID that was used for a previous backup, and it
will then be told whether the bitmap was newly created (either first
backup or the old bitmap was invalid) or whether it can be reused.

The provider then reads the parts of the NBD or block device it needs,
either the full disk for full backup, or the dirty parts according to
the bitmap for incremental backup. The bitmap has to be respected;
reads to other parts of the image will return an error. After backing
up each part of the disk, it should be discarded in the export to
avoid unnecessary space usage in the fleecing image (requires the
storage underlying the fleecing image to support discard too).
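
To illustrate the provider side of the 'block-device' mechanism, a
rough sketch of a backup loop over the dirty regions. The $volumes
structure with 'path' and 'next-dirty-region' is what gets handed to
the provider's backup_vm(); store_chunk() is a hypothetical helper and
short reads are ignored for brevity:

    for my $device (sort keys $volumes->%*) {
        my $info = $volumes->{$device};
        open(my $fh, '<', $info->{path}) or die "open $info->{path} - $!\n";
        while (my ($offset, $length) = $info->{'next-dirty-region'}->()) {
            sysseek($fh, $offset, 0) or die "seek failed - $!\n";
            sysread($fh, my $data, $length) // die "read failed - $!\n";
            store_chunk($device, $offset, $data); # hypothetical helper
            # afterwards, discard the region to free space in the fleecing image
        }
        close($fh);
    }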

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---

Changes in v3:
* adapt to API changes, config files are now passed as raw

 PVE/VZDump/QemuServer.pm | 309 ++++++++++++++++++++++++++++++++++++++-
 1 file changed, 308 insertions(+), 1 deletion(-)

diff --git a/PVE/VZDump/QemuServer.pm b/PVE/VZDump/QemuServer.pm
index b6dcd6cc..d0218c9b 100644
--- a/PVE/VZDump/QemuServer.pm
+++ b/PVE/VZDump/QemuServer.pm
@@ -20,7 +20,7 @@ use PVE::QMPClient;
 use PVE::Storage::Plugin;
 use PVE::Storage::PBSPlugin;
 use PVE::Storage;
-use PVE::Tools;
+use PVE::Tools qw(run_command);
 use PVE::VZDump;
 use PVE::Format qw(render_duration render_bytes);
 
@@ -277,6 +277,8 @@ sub archive {
 
     if ($self->{vzdump}->{opts}->{pbs}) {
 	$self->archive_pbs($task, $vmid);
+    } elsif ($self->{vzdump}->{'backup-provider'}) {
+	$self->archive_external($task, $vmid);
     } else {
 	$self->archive_vma($task, $vmid, $filename, $comp);
     }
@@ -1149,6 +1151,23 @@ sub cleanup {
 
     # If VM was started only for backup, it is already stopped now.
     if (PVE::QemuServer::Helpers::vm_running_locally($vmid)) {
+	if ($task->{cleanup}->{'nbd-stop'}) {
+	    eval { PVE::QemuServer::QMPHelpers::nbd_stop($vmid); };
+	    $self->logerr($@) if $@;
+	}
+
+	if (my $info = $task->{cleanup}->{'backup-access-teardown'}) {
+	    my $params = {
+		'target-id' => $info->{'target-id'},
+		timeout => 60,
+		success => $info->{success} ? JSON::true : JSON::false,
+	    };
+
+	    $self->loginfo("tearing down backup-access");
+	    eval { mon_cmd($vmid, "backup-access-teardown", $params->%*) };
+	    $self->logerr($@) if $@;
+	}
+
 	$detach_tpmstate_drive->($task, $vmid);
 	detach_fleecing_images($task->{disks}, $vmid) if $task->{'use-fleecing'};
     }
@@ -1160,4 +1179,292 @@ sub cleanup {
     }
 }
 
+my sub block_device_backup_cleanup {
+    my ($self, $paths, $cpids) = @_;
+
+    for my $path ($paths->@*) {
+	eval { run_command(["qemu-nbd", "-d", $path ]); };
+	$self->log('warn', "unable to disconnect NBD backup source '$path' - $@") if $@;
+    }
+
+    my $waited;
+    my $wait_limit = 5;
+    for ($waited = 0; $waited < $wait_limit && scalar(keys $cpids->%*); $waited++) {
+	while ((my $cpid = waitpid(-1, POSIX::WNOHANG)) > 0) {
+	    delete($cpids->{$cpid});
+	}
+	if ($waited == 0) {
+	    kill 15, $_ for keys $cpids->%*;
+	}
+	sleep 1;
+    }
+    if ($waited == $wait_limit && scalar(keys $cpids->%*)) {
+	kill 9, $_ for keys $cpids->%*;
+	sleep 1;
+	while ((my $cpid = waitpid(-1, POSIX::WNOHANG)) > 0) {
+	    delete($cpids->{$cpid});
+	}
+	$self->log('warn', "unable to collect nbdinfo child process '$_'") for keys $cpids->%*;
+    }
+}
+
+my sub block_device_backup_prepare {
+    my ($self, $devicename, $size, $nbd_path, $bitmap_name, $count) = @_;
+
+    my $nbd_info_uri = "nbd+unix:///${devicename}?socket=${nbd_path}";
+    my $qemu_nbd_uri = "nbd:unix:${nbd_path}:exportname=${devicename}";
+
+    my $cpid;
+    my $error_fh;
+    my $next_dirty_region;
+
+    # If there is no dirty bitmap, it can be treated as if there's a full dirty one. The output of
+    # nbdinfo is a list of tuples with offset, length, type, description. The first bit of 'type' is
+    # set when the bitmap is dirty, see QEMU's docs/interop/nbd.txt
+    my $dirty_bitmap = [];
+    if ($bitmap_name) {
+	my $input = IO::File->new();
+	my $info = IO::File->new();
+	$error_fh = IO::File->new();
+	my $nbdinfo_cmd = ["nbdinfo", $nbd_info_uri, "--map=qemu:dirty-bitmap:${bitmap_name}"];
+	$cpid = open3($input, $info, $error_fh, $nbdinfo_cmd->@*)
+	    or die "failed to spawn nbdinfo child - $!\n";
+
+	$next_dirty_region = sub {
+	    my ($offset, $length, $type);
+	    do {
+		my $line = <$info>;
+		return if !$line;
+		die "unexpected output from nbdinfo - $line\n"
+		    if $line !~ m/^\s*(\d+)\s*(\d+)\s*(\d+)/; # also untaints
+		($offset, $length, $type) = ($1, $2, $3);
+	    } while (($type & 0x1) == 0); # not dirty
+	    return ($offset, $length);
+	};
+    } else {
+	my $done = 0;
+	$next_dirty_region = sub {
+	    return if $done;
+	    $done = 1;
+	    return (0, $size);
+	};
+    }
+
+    my $blockdev = "/dev/nbd${count}";
+
+    eval {
+	run_command(["qemu-nbd", "-c", $blockdev, $qemu_nbd_uri, "--format=raw", "--discard=on"]);
+    };
+    if (my $err = $@) {
+	my $cpids = {};
+	$cpids->{$cpid} = 1 if $cpid;
+	block_device_backup_cleanup($self, [$blockdev], $cpids);
+	die $err;
+    }
+
+    return ($blockdev, $next_dirty_region, $cpid);
+}
+
+my sub backup_access_to_volume_info {
+    my ($self, $backup_access_info, $mechanism, $nbd_path) = @_;
+
+    my $child_pids = {}; # used for nbdinfo calls
+    my $count = 0; # counter for block devices, i.e. /dev/nbd${count}
+    my $volumes = {};
+
+    for my $info ($backup_access_info->@*) {
+	my $bitmap_status = 'none';
+	my $bitmap_name;
+	if (my $bitmap_action = $info->{'bitmap-action'}) {
+	    my $bitmap_action_to_status = {
+		'not-used' => 'none',
+		'not-used-removed' => 'none',
+		'new' => 'new',
+		'used' => 'reuse',
+		'invalid' => 'new',
+	    };
+
+	    $bitmap_status = $bitmap_action_to_status->{$bitmap_action}
+		or die "got unexpected bitmap action '$bitmap_action'\n";
+
+	    $bitmap_name = $info->{'bitmap-name'} or die "bitmap-name is not present\n";
+	}
+
+	my ($device, $size) = $info->@{qw(device size)};
+
+	$volumes->{$device}->{'bitmap-mode'} = $bitmap_status;
+	$volumes->{$device}->{size} = $size;
+
+	if ($mechanism eq 'block-device') {
+	    my ($blockdev, $next_dirty_region, $child_pid) = block_device_backup_prepare(
+		$self, $device, $size, $nbd_path, $bitmap_name, $count);
+	    $count++;
+	    $child_pids->{$child_pid} = 1 if $child_pid;
+	    $volumes->{$device}->{path} = $blockdev;
+	    $volumes->{$device}->{'next-dirty-region'} = $next_dirty_region;
+	} elsif ($mechanism eq 'nbd') {
+	    $volumes->{$device}->{'nbd-path'} = $nbd_path;
+	    $volumes->{$device}->{'bitmap-name'} = $bitmap_name;
+	} else {
+	    die "internal error - unknown mechanism '$mechanism'";
+	}
+    }
+
+    return ($volumes, $child_pids);
+}
+
+sub archive_external {
+    my ($self, $task, $vmid) = @_;
+
+    my $guest_config = PVE::Tools::file_get_contents("$task->{tmpdir}/qemu-server.conf");
+    my $firewall_file = "$task->{tmpdir}/qemu-server.fw";
+
+    my $opts = $self->{vzdump}->{opts};
+
+    my $backup_provider = $self->{vzdump}->{'backup-provider'};
+
+    $self->loginfo("starting external backup via " . $backup_provider->provider_name());
+
+    my $starttime = time();
+
+    # get list early so we die on unknown drive types before doing anything
+    my $devlist = _get_task_devlist($task);
+
+    $self->enforce_vm_running_for_backup($vmid);
+    $self->{qmeventd_fh} = PVE::QemuServer::register_qmeventd_handle($vmid);
+
+    eval {
+	$SIG{INT} = $SIG{TERM} = $SIG{QUIT} = $SIG{HUP} = $SIG{PIPE} = sub {
+	    die "interrupted by signal\n";
+	};
+
+	my $qemu_support = mon_cmd($vmid, "query-proxmox-support");
+
+	$attach_tpmstate_drive->($self, $task, $vmid);
+
+	my $is_template = PVE::QemuConfig->is_template($self->{vmlist}->{$vmid});
+
+	my $fleecing = check_and_prepare_fleecing(
+	    $self, $task, $vmid, $opts->{fleecing}, $task->{disks}, $is_template, $qemu_support, 1);
+	die "cannot setup backup access without fleecing\n" if !$fleecing;
+
+	$task->{'use-fleecing'} = 1;
+
+	my $fs_frozen = $self->qga_fs_freeze($task, $vmid);
+
+	my $target_id = $opts->{storage};
+
+	my $params = {
+	    'target-id' => $target_id,
+	    devlist => $devlist,
+	    timeout => 60,
+	};
+
+	my ($mechanism, $bitmap_name) = $backup_provider->backup_get_mechanism($vmid, 'qemu');
+	die "mechanism '$mechanism' requested by backup provider is not supported for VMs\n"
+	    if $mechanism ne 'block-device' && $mechanism ne 'nbd';
+
+	if ($mechanism eq 'block-device') {
+	    # For mechanism 'block-device' the bitmap needs to be passed to the provider. The bitmap
+	    # cannot be dumped via QMP and doing it via qemu-img is experimental, so use nbdinfo.
+	    die "need 'nbdinfo' binary from package libnbd-bin\n" if !-e "/usr/bin/nbdinfo";
+
+	    # NOTE nbds_max won't change if module is already loaded
+	    run_command(["modprobe", "nbd", "nbds_max=128"]);
+	}
+
+	if ($bitmap_name) {
+	    # prepend storage ID so different providers can never cause clashes
+	    $bitmap_name = "$opts->{storage}-" . $bitmap_name;
+	    $params->{'bitmap-name'} = $bitmap_name;
+	}
+
+	$self->loginfo("setting up snapshot-access for backup");
+
+	my $backup_access_info = eval { mon_cmd($vmid, "backup-access-setup", $params->%*) };
+	my $qmperr = $@;
+
+	$task->{cleanup}->{'backup-access-teardown'} = { 'target-id' => $target_id, success => 0 };
+
+	if ($fs_frozen) {
+	    $self->qga_fs_thaw($vmid);
+	}
+
+	die $qmperr if $qmperr;
+
+	$self->resume_vm_after_job_start($task, $vmid);
+
+	my $bitmap_info = mon_cmd($vmid, 'query-pbs-bitmap-info');
+	for my $info (sort { $a->{drive} cmp $b->{drive} } $bitmap_info->@*) {
+	    my $text = $bitmap_action_to_human->($self, $info);
+	    my $drive = $info->{drive};
+	    $drive =~ s/^drive-//; # for consistency
+	    $self->loginfo("$drive: dirty-bitmap status: $text");
+	}
+
+	$self->loginfo("starting NBD server");
+
+	my $nbd_path = "/run/qemu-server/$vmid\_nbd.backup_access";
+	mon_cmd(
+	    $vmid, "nbd-server-start", addr => { type => 'unix', data => { path => $nbd_path } } );
+	$task->{cleanup}->{'nbd-stop'} = 1;
+
+	for my $info ($backup_access_info->@*) {
+	    $self->loginfo("adding NBD export for $info->{device}");
+
+	    my $export_params = {
+		id => $info->{device},
+		'node-name' => $info->{'node-name'},
+		writable => JSON::true, # for discard
+		type => "nbd",
+		name => $info->{device}, # NBD export name
+	    };
+
+	    if ($info->{'bitmap-name'}) {
+		$export_params->{bitmaps} = [{
+		    node => $info->{'bitmap-node-name'},
+		    name => $info->{'bitmap-name'},
+		}];
+	    }
+
+	    mon_cmd($vmid, "block-export-add", $export_params->%*);
+	}
+
+	my $child_pids = {}; # used for nbdinfo calls
+	my $volumes = {};
+
+	eval {
+	    ($volumes, $child_pids) =
+		backup_access_to_volume_info($self, $backup_access_info, $mechanism, $nbd_path);
+
+	    my $param = {};
+	    $param->{'bandwidth-limit'} = $opts->{bwlimit} * 1024 if $opts->{bwlimit};
+	    $param->{'firewall-config'} = PVE::Tools::file_get_contents($firewall_file)
+		if -e $firewall_file;
+
+	    $backup_provider->backup_vm($vmid, $guest_config, $volumes, $param);
+	};
+	my $err = $@;
+
+	if ($mechanism eq 'block-device') {
+	    my $cleanup_paths = [map { $volumes->{$_}->{path} } keys $volumes->%*];
+	    block_device_backup_cleanup($self, $cleanup_paths, $child_pids);
+	}
+
+	die $err if $err;
+    };
+    my $err = $@;
+
+    if ($err) {
+	$self->logerr($err);
+	$self->resume_vm_after_job_start($task, $vmid);
+    } else {
+	$task->{size} = $backup_provider->backup_get_task_size($vmid);
+	$task->{cleanup}->{'backup-access-teardown'}->{success} = 1;
+    }
+    $self->restore_vm_power_state($vmid);
+
+    die $err if $err;
+}
+
 1;
-- 
2.39.5




* [pve-devel] [PATCH qemu-server v3 22/34] restore: die early when there is no size for a device
  2024-11-07 16:51 [pve-devel] [RFC qemu/common/storage/qemu-server/container/manager v3 00/34] backup provider API Fiona Ebner
                   ` (20 preceding siblings ...)
  2024-11-07 16:51 ` [pve-devel] [RFC qemu-server v3 21/34] backup: implement backup for external providers Fiona Ebner
@ 2024-11-07 16:51 ` Fiona Ebner
  2024-11-12  9:28   ` [pve-devel] applied: " Fabian Grünbichler
  2024-11-07 16:51 ` [pve-devel] [RFC qemu-server v3 23/34] backup: implement restore for external providers Fiona Ebner
                   ` (12 subsequent siblings)
  34 siblings, 1 reply; 63+ messages in thread
From: Fiona Ebner @ 2024-11-07 16:51 UTC (permalink / raw)
  To: pve-devel

Makes it a clean error for buggy (external) backup providers where the
size might not be set at all.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---

No changes in v3.

 PVE/QemuServer.pm | 1 +
 1 file changed, 1 insertion(+)

diff --git a/PVE/QemuServer.pm b/PVE/QemuServer.pm
index 49b6ca17..30e51a8c 100644
--- a/PVE/QemuServer.pm
+++ b/PVE/QemuServer.pm
@@ -6813,6 +6813,7 @@ my $restore_allocate_devices = sub {
     my $map = {};
     foreach my $virtdev (sort keys %$virtdev_hash) {
 	my $d = $virtdev_hash->{$virtdev};
+	die "got no size for '$virtdev'\n" if !defined($d->{size});
 	my $alloc_size = int(($d->{size} + 1024 - 1)/1024);
 	my $storeid = $d->{storeid};
 	my $scfg = PVE::Storage::storage_config($storecfg, $storeid);
-- 
2.39.5




* [pve-devel] [RFC qemu-server v3 23/34] backup: implement restore for external providers
  2024-11-07 16:51 [pve-devel] [RFC qemu/common/storage/qemu-server/container/manager v3 00/34] backup provider API Fiona Ebner
                   ` (21 preceding siblings ...)
  2024-11-07 16:51 ` [pve-devel] [PATCH qemu-server v3 22/34] restore: die early when there is no size for a device Fiona Ebner
@ 2024-11-07 16:51 ` Fiona Ebner
  2024-11-07 16:51 ` [pve-devel] [RFC qemu-server v3 24/34] backup restore: external: hardening check for untrusted source image Fiona Ebner
                   ` (11 subsequent siblings)
  34 siblings, 0 replies; 63+ messages in thread
From: Fiona Ebner @ 2024-11-07 16:51 UTC (permalink / raw)
  To: pve-devel

First, the provider is asked which restore mechanism to use.
Currently, only 'qemu-img' is possible. Then the configuration files
are restored, the provider gives information about the volumes
contained in the backup, and finally the volumes are restored via
'qemu-img convert'.

The code for the restore_external_archive() function was copied and
adapted from the restore_proxmox_backup_archive() function. Together
with restore_vma_archive() it seems sensible to extract the common
parts and use a dedicated module for restore code.

The parse_restore_archive() helper was renamed, because it's not just
parsing.
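
As an illustration of the provider side, a minimal sketch of the
volume init method called by the restore code below. Only the returned
'qemu-img-path' key is what restore_external_archive() relies on; the
mount point handling is an assumption:

    sub restore_vm_volume_init {
        my ($self, $volname, $storeid, $devname, $info) = @_;

        # Assumption: restore_vm_init() already made the archive contents
        # available under a local mount point.
        return { 'qemu-img-path' => "$self->{mountpoint}/$devname.raw" };
    }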

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---

Changes in v3:
* use new storage_has_feature() helper

 PVE/API2/Qemu.pm  |  30 +++++++++-
 PVE/QemuServer.pm | 139 ++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 166 insertions(+), 3 deletions(-)

diff --git a/PVE/API2/Qemu.pm b/PVE/API2/Qemu.pm
index 1c3cb271..319518e8 100644
--- a/PVE/API2/Qemu.pm
+++ b/PVE/API2/Qemu.pm
@@ -845,7 +845,7 @@ __PACKAGE__->register_method({
 	return $res;
     }});
 
-my $parse_restore_archive = sub {
+my $classify_restore_archive = sub {
     my ($storecfg, $archive) = @_;
 
     my ($archive_storeid, $archive_volname) = PVE::Storage::parse_volume_id($archive, 1);
@@ -859,6 +859,22 @@ my $parse_restore_archive = sub {
 	    $res->{type} = 'pbs';
 	    return $res;
 	}
+	if (PVE::Storage::storage_has_feature($storecfg, $archive_storeid, 'backup-provider')) {
+	    my $log_function = sub {
+		my ($log_level, $message) = @_;
+		my $prefix = $log_level eq 'err' ? 'ERROR' : uc($log_level);
+		print "$prefix: $message\n";
+	    };
+	    my $backup_provider = PVE::Storage::new_backup_provider(
+		$storecfg,
+		$archive_storeid,
+		$log_function,
+	    );
+
+	    $res->{type} = 'external';
+	    $res->{'backup-provider'} = $backup_provider;
+	    return $res;
+	}
     }
     my $path = PVE::Storage::abs_filesystem_path($storecfg, $archive);
     $res->{type} = 'file';
@@ -1011,7 +1027,7 @@ __PACKAGE__->register_method({
 		    'backup',
 		);
 
-		$archive = $parse_restore_archive->($storecfg, $archive);
+		$archive = $classify_restore_archive->($storecfg, $archive);
 	    }
 	}
 
@@ -1069,7 +1085,15 @@ __PACKAGE__->register_method({
 			PVE::QemuServer::check_restore_permissions($rpcenv, $authuser, $merged);
 		    }
 		}
-		if ($archive->{type} eq 'file' || $archive->{type} eq 'pipe') {
+		if (my $backup_provider = $archive->{'backup-provider'}) {
+		    PVE::QemuServer::restore_external_archive(
+			$backup_provider,
+			$archive->{volid},
+			$vmid,
+			$authuser,
+			$restore_options,
+		    );
+		} elsif ($archive->{type} eq 'file' || $archive->{type} eq 'pipe') {
 		    die "live-restore is only compatible with backup images from a Proxmox Backup Server\n"
 			if $live_restore;
 		    PVE::QemuServer::restore_file_archive($archive->{path} // '-', $vmid, $authuser, $restore_options);
diff --git a/PVE/QemuServer.pm b/PVE/QemuServer.pm
index 30e51a8c..f484d048 100644
--- a/PVE/QemuServer.pm
+++ b/PVE/QemuServer.pm
@@ -7303,6 +7303,145 @@ sub restore_proxmox_backup_archive {
     }
 }
 
+sub restore_external_archive {
+    my ($backup_provider, $archive, $vmid, $user, $options) = @_;
+
+    die "live restore from backup provider is not implemented\n" if $options->{live};
+
+    my $storecfg = PVE::Storage::config();
+
+    my ($storeid, $volname) = PVE::Storage::parse_volume_id($archive);
+    my $scfg = PVE::Storage::storage_config($storecfg, $storeid);
+
+    my $tmpdir = "/var/tmp/vzdumptmp$$";
+    rmtree $tmpdir;
+    mkpath $tmpdir;
+
+    my $conffile = PVE::QemuConfig->config_file($vmid);
+    # disable interrupts (always do cleanups)
+    local $SIG{INT} =
+	local $SIG{TERM} =
+	local $SIG{QUIT} =
+	local $SIG{HUP} = sub { print STDERR "got interrupt - ignored\n"; };
+
+    # Note: $oldconf is undef if VM does not exist
+    my $cfs_path = PVE::QemuConfig->cfs_config_path($vmid);
+    my $oldconf = PVE::Cluster::cfs_read_file($cfs_path);
+    my $new_conf_raw = '';
+
+    my $rpcenv = PVE::RPCEnvironment::get();
+    my $devinfo = {}; # info about drives included in backup
+    my $virtdev_hash = {}; # info about allocated drives
+
+    eval {
+	# enable interrupts
+	local $SIG{INT} =
+	    local $SIG{TERM} =
+	    local $SIG{QUIT} =
+	    local $SIG{HUP} =
+	    local $SIG{PIPE} = sub { die "interrupted by signal\n"; };
+
+	my $cfgfn = "$tmpdir/qemu-server.conf";
+	my $firewall_config_fn = "$tmpdir/fw.conf";
+
+	my $cmd = "restore";
+
+	my ($mechanism, $vmtype) =
+	    $backup_provider->restore_get_mechanism($volname, $storeid);
+	die "mechanism '$mechanism' requested by backup provider is not supported for VMs\n"
+	    if $mechanism ne 'qemu-img';
+	die "cannot restore non-VM guest of type '$vmtype'\n" if $vmtype ne 'qemu';
+
+	$devinfo = $backup_provider->restore_vm_init($volname, $storeid);
+
+	my $data = $backup_provider->restore_get_guest_config($volname, $storeid)
+	    or die "backup provider failed to extract guest configuration\n";
+	PVE::Tools::file_set_contents($cfgfn, $data);
+
+	if ($data = $backup_provider->restore_get_firewall_config($volname, $storeid)) {
+	    PVE::Tools::file_set_contents($firewall_config_fn, $data);
+	    my $pve_firewall_dir = '/etc/pve/firewall';
+	    mkdir $pve_firewall_dir; # make sure the dir exists
+	    PVE::Tools::file_copy($firewall_config_fn, "${pve_firewall_dir}/$vmid.fw");
+	}
+
+	my $fh = IO::File->new($cfgfn, "r") or die "unable to read qemu-server.conf - $!\n";
+
+	$virtdev_hash = $parse_backup_hints->($rpcenv, $user, $storecfg, $fh, $devinfo, $options);
+
+	# create empty/temp config
+	PVE::Tools::file_set_contents($conffile, "memory: 128\nlock: create");
+
+	$restore_cleanup_oldconf->($storecfg, $vmid, $oldconf, $virtdev_hash) if $oldconf;
+
+	# allocate volumes
+	my $map = $restore_allocate_devices->($storecfg, $virtdev_hash, $vmid);
+
+	for my $virtdev (sort keys $virtdev_hash->%*) {
+	    my $d = $virtdev_hash->{$virtdev};
+	    next if $d->{is_cloudinit}; # no need to restore cloudinit
+
+	    my $info =
+		$backup_provider->restore_vm_volume_init($volname, $storeid, $d->{devname}, {});
+	    my $source_path = $info->{'qemu-img-path'}
+		or die "did not get source image path from backup provider\n";
+	    eval {
+		qemu_img_convert(
+		    $source_path, $d->{volid}, $d->{size}, undef, 0, $options->{bwlimit});
+	    };
+	    my $err = $@;
+	    eval {
+		$backup_provider->restore_vm_volume_cleanup($volname, $storeid, $d->{devname}, {});
+	    };
+	    if (my $cleanup_err = $@) {
+		die $cleanup_err if !$err;
+		warn $cleanup_err;
+	    }
+	    die $err if $err
+	}
+
+	$fh->seek(0, 0) || die "seek failed - $!\n";
+
+	my $cookie = { netcount => 0 };
+	while (defined(my $line = <$fh>)) {
+	    $new_conf_raw .= restore_update_config_line(
+		$cookie,
+		$map,
+		$line,
+		$options->{unique},
+	    );
+	}
+
+	$fh->close();
+    };
+    my $err = $@;
+
+    eval { $backup_provider->restore_vm_cleanup($volname, $storeid); };
+    warn "backup provider cleanup after restore failed - $@" if $@;
+
+    if ($err) {
+	$restore_deactivate_volumes->($storecfg, $virtdev_hash);
+    }
+
+    rmtree $tmpdir;
+
+    if ($err) {
+	$restore_destroy_volumes->($storecfg, $virtdev_hash);
+	die $err;
+    }
+
+    my $new_conf = restore_merge_config($conffile, $new_conf_raw, $options->{override_conf});
+    check_restore_permissions($rpcenv, $user, $new_conf);
+    PVE::QemuConfig->write_config($vmid, $new_conf);
+
+    eval { rescan($vmid, 1); };
+    warn $@ if $@;
+
+    PVE::AccessControl::add_vm_to_pool($vmid, $options->{pool}) if $options->{pool};
+
+    return;
+}
+
 sub pbs_live_restore {
     my ($vmid, $conf, $storecfg, $restored_disks, $opts) = @_;
 
-- 
2.39.5




* [pve-devel] [RFC qemu-server v3 24/34] backup restore: external: hardening check for untrusted source image
  2024-11-07 16:51 [pve-devel] [RFC qemu/common/storage/qemu-server/container/manager v3 00/34] backup provider API Fiona Ebner
                   ` (22 preceding siblings ...)
  2024-11-07 16:51 ` [pve-devel] [RFC qemu-server v3 23/34] backup: implement restore for external providers Fiona Ebner
@ 2024-11-07 16:51 ` Fiona Ebner
  2024-11-07 16:51 ` [pve-devel] [PATCH container v3 25/34] create: add missing include of PVE::Storage::Plugin Fiona Ebner
                   ` (10 subsequent siblings)
  34 siblings, 0 replies; 63+ messages in thread
From: Fiona Ebner @ 2024-11-07 16:51 UTC (permalink / raw)
  To: pve-devel

Suggested-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---

New in v3.

Actual checking being done depends on Fabian's hardening patches:
https://lore.proxmox.com/pve-devel/20241104104221.228730-1-f.gruenbichler@proxmox.com/

 PVE/QemuServer.pm | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/PVE/QemuServer.pm b/PVE/QemuServer.pm
index f484d048..c2e7b4a5 100644
--- a/PVE/QemuServer.pm
+++ b/PVE/QemuServer.pm
@@ -7385,6 +7385,12 @@ sub restore_external_archive {
 		$backup_provider->restore_vm_volume_init($volname, $storeid, $d->{devname}, {});
 	    my $source_path = $info->{'qemu-img-path'}
 		or die "did not get source image path from backup provider\n";
+
+	    print "importing drive '$d->{devname}' from '$source_path'\n";
+
+	    # safety check for untrusted source image
+	    PVE::Storage::file_size_info($source_path, undef, 1);
+
 	    eval {
 		qemu_img_convert(
 		    $source_path, $d->{volid}, $d->{size}, undef, 0, $options->{bwlimit});
-- 
2.39.5




* [pve-devel] [PATCH container v3 25/34] create: add missing include of PVE::Storage::Plugin
  2024-11-07 16:51 [pve-devel] [RFC qemu/common/storage/qemu-server/container/manager v3 00/34] backup provider API Fiona Ebner
                   ` (23 preceding siblings ...)
  2024-11-07 16:51 ` [pve-devel] [RFC qemu-server v3 24/34] backup restore: external: hardening check for untrusted source image Fiona Ebner
@ 2024-11-07 16:51 ` Fiona Ebner
  2024-11-12 15:22   ` [pve-devel] applied: " Fabian Grünbichler
  2024-11-07 16:51 ` [pve-devel] [RFC container v3 26/34] backup: implement backup for external providers Fiona Ebner
                   ` (9 subsequent siblings)
  34 siblings, 1 reply; 63+ messages in thread
From: Fiona Ebner @ 2024-11-07 16:51 UTC (permalink / raw)
  To: pve-devel

used for the shared 'COMMON_TAR_FLAGS' variable.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---

New in v3.

 src/PVE/LXC/Create.pm | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/PVE/LXC/Create.pm b/src/PVE/LXC/Create.pm
index 117103c..7c5bf0a 100644
--- a/src/PVE/LXC/Create.pm
+++ b/src/PVE/LXC/Create.pm
@@ -8,6 +8,7 @@ use Fcntl;
 
 use PVE::RPCEnvironment;
 use PVE::Storage::PBSPlugin;
+use PVE::Storage::Plugin;
 use PVE::Storage;
 use PVE::DataCenterConfig;
 use PVE::LXC;
-- 
2.39.5



_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


^ permalink raw reply	[flat|nested] 63+ messages in thread

* [pve-devel] [RFC container v3 26/34] backup: implement backup for external providers
  2024-11-07 16:51 [pve-devel] [RFC qemu/common/storage/qemu-server/container/manager v3 00/34] backup provider API Fiona Ebner
                   ` (24 preceding siblings ...)
  2024-11-07 16:51 ` [pve-devel] [PATCH container v3 25/34] create: add missing include of PVE::Storage::Plugin Fiona Ebner
@ 2024-11-07 16:51 ` Fiona Ebner
  2024-11-07 16:51 ` [pve-devel] [RFC container v3 27/34] create: factor out tar restore command helper Fiona Ebner
                   ` (8 subsequent siblings)
  34 siblings, 0 replies; 63+ messages in thread
From: Fiona Ebner @ 2024-11-07 16:51 UTC (permalink / raw)
  To: pve-devel

The filesystem structure is made available as a directory in a
consistent manner (with details depending on the vzdump backup mode)
just like for regular backup via tar.

The backup_container() method of the backup provider is executed in
a user namespace with the container's ID mapping applied. This allows
the backup provider to see the container's filesystem from the
container's perspective.

The 'prepare' phase of the backup hook is executed right before that
and allows the backup provider to prepare for the (usually)
unprivileged execution context in the user namespace.

The backup provider needs to back up the guest and firewall
configuration and the filesystem structure of the container, honoring
file exclusions and the bandwidth limit.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---

Changes in v3:
* pass in config as raw data instead of file name
* run the backup_container() method in the user namespace associated
  with the container
* warn if backing up privileged container to external provider

 src/PVE/VZDump/LXC.pm | 38 +++++++++++++++++++++++++++++++++++++-
 1 file changed, 37 insertions(+), 1 deletion(-)

diff --git a/src/PVE/VZDump/LXC.pm b/src/PVE/VZDump/LXC.pm
index 1928548..d201d8a 100644
--- a/src/PVE/VZDump/LXC.pm
+++ b/src/PVE/VZDump/LXC.pm
@@ -14,6 +14,7 @@ use PVE::LXC::Config;
 use PVE::LXC;
 use PVE::Storage;
 use PVE::Tools;
+use PVE::Env;
 use PVE::VZDump;
 
 use base qw (PVE::VZDump::Plugin);
@@ -124,6 +125,7 @@ sub prepare {
 
     my ($id_map, $root_uid, $root_gid) = PVE::LXC::parse_id_maps($conf);
     $task->{userns_cmd} = PVE::LXC::userns_command($id_map);
+    $task->{id_map} = $id_map;
     $task->{root_uid} = $root_uid;
     $task->{root_gid} = $root_gid;
 
@@ -373,7 +375,41 @@ sub archive {
     my $userns_cmd = $task->{userns_cmd};
     my $findexcl = $self->{vzdump}->{findexcl};
 
-    if ($self->{vzdump}->{opts}->{pbs}) {
+    if (my $backup_provider = $self->{vzdump}->{'backup-provider'}) {
+	$self->loginfo("starting external backup via " . $backup_provider->provider_name());
+
+	if (!scalar($task->{id_map}->@*) || $task->{root_uid} == 0 || $task->{root_gid} == 0) {
+	    $self->log("warn", "external backup of privileged container can only be restored as"
+		." unprivileged, which might not work in all cases");
+	}
+
+	my ($mechanism) = $backup_provider->backup_get_mechanism($vmid, 'lxc');
+	die "mechanism '$mechanism' requested by backup provider is not supported for containers\n"
+	    if $mechanism ne 'directory';
+
+	my $guest_config = PVE::Tools::file_get_contents("$tmpdir/etc/vzdump/pct.conf");
+	my $firewall_file = "$tmpdir/etc/vzdump/pct.fw";
+
+	my $info = {
+	    directory => $snapdir,
+	    sources => [@sources],
+	    'backup-user-id' => $task->{root_uid},
+	};
+	$info->{'firewall-config'} = PVE::Tools::file_get_contents($firewall_file)
+	    if -e $firewall_file;
+	$info->{'bandwidth-limit'} = $opts->{bwlimit} * 1024 if $opts->{bwlimit};
+
+	$backup_provider->backup_hook('prepare', $vmid, 'lxc', $info);
+
+	if (scalar($task->{id_map}->@*)) {
+	    PVE::Env::run_in_userns(
+		sub { $backup_provider->backup_container($vmid, $guest_config, $findexcl, $info); },
+		$task->{id_map},
+	    );
+	} else {
+	    $backup_provider->backup_container($vmid, $guest_config, $findexcl, $info);
+	}
+    } elsif ($self->{vzdump}->{opts}->{pbs}) {
 
 	my $param = [];
 	push @$param, "pct.conf:$tmpdir/etc/vzdump/pct.conf";
-- 
2.39.5
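
To make the provider-side contract concrete, here is a rough sketch of what
a 'directory'-mechanism backup_container() might look like. It runs with
the container's ID mapping applied, so $info->{directory} is seen from the
container's perspective. The target path and the direct tar invocation are
illustrative assumptions rather than part of the API, and treating the
vzdump exclude patterns as plain tar excludes is a simplification:

    sub backup_container {
        my ($self, $vmid, $guest_config, $exclude_patterns, $info) = @_;

        # handles to the provider's storage would typically be set up during
        # the privileged 'prepare' hook phase; here, simply assume a path
        # that is writable from within the user namespace
        my $target = "/path/to/staging/ct-$vmid.tar"; # hypothetical

        my $cmd = ['tar', 'cpf', $target, "--directory=$info->{directory}"];
        push $cmd->@*, map { ('--exclude', $_) } $exclude_patterns->@*;
        push $cmd->@*, '.';

        PVE::Tools::run_command($cmd);
        return;
    }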



_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


^ permalink raw reply	[flat|nested] 63+ messages in thread

* [pve-devel] [RFC container v3 27/34] create: factor out tar restore command helper
  2024-11-07 16:51 [pve-devel] [RFC qemu/common/storage/qemu-server/container/manager v3 00/34] backup provider API Fiona Ebner
                   ` (25 preceding siblings ...)
  2024-11-07 16:51 ` [pve-devel] [RFC container v3 26/34] backup: implement backup for external providers Fiona Ebner
@ 2024-11-07 16:51 ` Fiona Ebner
  2024-11-12 16:28   ` Fabian Grünbichler
  2024-11-12 17:08   ` [pve-devel] applied: " Thomas Lamprecht
  2024-11-07 16:51 ` [pve-devel] [RFC container v3 28/34] backup: implement restore for external providers Fiona Ebner
                   ` (7 subsequent siblings)
  34 siblings, 2 replies; 63+ messages in thread
From: Fiona Ebner @ 2024-11-07 16:51 UTC (permalink / raw)
  To: pve-devel

In preparation for re-using it for restores from backup providers.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---

New in v3.

 src/PVE/LXC/Create.pm | 42 +++++++++++++++++++++++++-----------------
 1 file changed, 25 insertions(+), 17 deletions(-)

diff --git a/src/PVE/LXC/Create.pm b/src/PVE/LXC/Create.pm
index 7c5bf0a..8c8cb9a 100644
--- a/src/PVE/LXC/Create.pm
+++ b/src/PVE/LXC/Create.pm
@@ -59,12 +59,34 @@ sub restore_proxmox_backup_archive {
 	$scfg, $storeid, $cmd, $param, userns_cmd => $userns_cmd);
 }
 
-sub restore_tar_archive {
-    my ($archive, $rootdir, $conf, $no_unpack_error, $bwlimit) = @_;
+my sub restore_tar_archive_command {
+    my ($conf, $opts, $rootdir, $bwlimit) = @_;
 
     my ($id_map, $root_uid, $root_gid) = PVE::LXC::parse_id_maps($conf);
     my $userns_cmd = PVE::LXC::userns_command($id_map);
 
+    my $cmd = [@$userns_cmd, 'tar', 'xpf', '-', $opts->@*, '--totals',
+               @PVE::Storage::Plugin::COMMON_TAR_FLAGS,
+               '-C', $rootdir];
+
+    # skip-old-files doesn't have anything to do with time (old/new), but is
+    # simply -k (annoyingly also called --keep-old-files) without the 'treat
+    # existing files as errors' part... iow. it's bsdtar's interpretation of -k
+    # *sigh*, gnu...
+    push @$cmd, '--skip-old-files';
+    push @$cmd, '--anchored';
+    push @$cmd, '--exclude' , './dev/*';
+
+    if (defined($bwlimit)) {
+	$cmd = [ ['cstream', '-t', $bwlimit*1024], $cmd ];
+    }
+
+    return $cmd;
+}
+
+sub restore_tar_archive {
+    my ($archive, $rootdir, $conf, $no_unpack_error, $bwlimit) = @_;
+
     my $archive_fh;
     my $tar_input = '<&STDIN';
     my @compression_opt;
@@ -92,21 +114,7 @@ sub restore_tar_archive {
 	$tar_input = '<&'.fileno($archive_fh);
     }
 
-    my $cmd = [@$userns_cmd, 'tar', 'xpf', '-', @compression_opt, '--totals',
-               @PVE::Storage::Plugin::COMMON_TAR_FLAGS,
-               '-C', $rootdir];
-
-    # skip-old-files doesn't have anything to do with time (old/new), but is
-    # simply -k (annoyingly also called --keep-old-files) without the 'treat
-    # existing files as errors' part... iow. it's bsdtar's interpretation of -k
-    # *sigh*, gnu...
-    push @$cmd, '--skip-old-files';
-    push @$cmd, '--anchored';
-    push @$cmd, '--exclude' , './dev/*';
-
-    if (defined($bwlimit)) {
-	$cmd = [ ['cstream', '-t', $bwlimit*1024], $cmd ];
-    }
+    my $cmd = restore_tar_archive_command($conf, [@compression_opt], $rootdir, $bwlimit);
 
     if ($archive eq '-') {
 	print "extracting archive from STDIN\n";
-- 
2.39.5
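
The nested-array return value in the bandwidth-limited case is how
PVE::Tools::run_command() expresses a pipeline. Roughly, the helper
produces one of the following two shapes (lxc-usernsexec as the user
namespace wrapper and the abbreviated flag lists are assumptions):

    # without bwlimit: a single command vector
    ['lxc-usernsexec', ..., '--', 'tar', 'xpf', '-', '--totals', ..., '-C', $rootdir]

    # with bwlimit: cstream throttles the archive stream fed into tar,
    # similar to: cstream -t <bytes/s> | lxc-usernsexec ... tar xpf - ...
    [
        ['cstream', '-t', $bwlimit * 1024], # KiB/s converted to bytes/s
        ['lxc-usernsexec', ..., '--', 'tar', 'xpf', '-', '--totals', ..., '-C', $rootdir],
    ]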



_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


^ permalink raw reply	[flat|nested] 63+ messages in thread

* [pve-devel] [RFC container v3 28/34] backup: implement restore for external providers
  2024-11-07 16:51 [pve-devel] [RFC qemu/common/storage/qemu-server/container/manager v3 00/34] backup provider API Fiona Ebner
                   ` (26 preceding siblings ...)
  2024-11-07 16:51 ` [pve-devel] [RFC container v3 27/34] create: factor out tar restore command helper Fiona Ebner
@ 2024-11-07 16:51 ` Fiona Ebner
  2024-11-12 16:27   ` Fabian Grünbichler
  2024-11-07 16:51 ` [pve-devel] [RFC container v3 29/34] external restore: don't use 'one-file-system' tar flag when restoring from a directory Fiona Ebner
                   ` (6 subsequent siblings)
  34 siblings, 1 reply; 63+ messages in thread
From: Fiona Ebner @ 2024-11-07 16:51 UTC (permalink / raw)
  To: pve-devel

First, the provider is asked what restore mechanism to use.
Currently, 'directory' and 'tar' are possible, for restoring either
from a directory containing the full filesystem structure (for which
tar is used) or from a potentially compressed tar file containing the
same.

The new functions are copied and adapted from the existing ones for
PBS or tar, and it might be worth factoring out the common parts.

Restoring containers as privileged is prohibited, because archives
from an external provider are considered less trusted than those from
Proxmox VE storages. If that is ever allowed in the future, it would
at least be worth extracting the tar archive in a restricted context
(e.g. a user namespace with an ID-mapped mount, or seccomp).

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---

Changes in v3:
* Use a user namespace when restoring with the 'directory' mechanism
  (and use tar instead of rsync, because it is easier to split into a
  privileged and an unprivileged half)

 src/PVE/LXC/Create.pm | 141 ++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 141 insertions(+)

diff --git a/src/PVE/LXC/Create.pm b/src/PVE/LXC/Create.pm
index 8c8cb9a..8657ac1 100644
--- a/src/PVE/LXC/Create.pm
+++ b/src/PVE/LXC/Create.pm
@@ -7,6 +7,7 @@ use File::Path;
 use Fcntl;
 
 use PVE::RPCEnvironment;
+use PVE::RESTEnvironment qw(log_warn);
 use PVE::Storage::PBSPlugin;
 use PVE::Storage::Plugin;
 use PVE::Storage;
@@ -26,6 +27,24 @@ sub restore_archive {
 	if ($scfg->{type} eq 'pbs') {
 	    return restore_proxmox_backup_archive($storage_cfg, $archive, $rootdir, $conf, $no_unpack_error, $bwlimit);
 	}
+	if (PVE::Storage::storage_has_feature($storage_cfg, $storeid, 'backup-provider')) {
+	    my $log_function = sub {
+		my ($log_level, $message) = @_;
+		my $prefix = $log_level eq 'err' ? 'ERROR' : uc($log_level);
+		print "$prefix: $message\n";
+	    };
+	    my $backup_provider =
+		PVE::Storage::new_backup_provider($storage_cfg, $storeid, $log_function);
+	    return restore_external_archive(
+		$backup_provider,
+		$storeid,
+		$volname,
+		$rootdir,
+		$conf,
+		$no_unpack_error,
+		$bwlimit,
+	    );
+	}
     }
 
     $archive = PVE::Storage::abs_filesystem_path($storage_cfg, $archive) if $archive ne '-';
@@ -127,6 +146,54 @@ sub restore_tar_archive {
     die $err if $err && !$no_unpack_error;
 }
 
+sub restore_external_archive {
+    my ($backup_provider, $storeid, $volname, $rootdir, $conf, $no_unpack_error, $bwlimit) = @_;
+
+    die "refusing to restore privileged container backup from external source\n"
+	if !$conf->{unprivileged};
+
+    my ($mechanism, $vmtype) = $backup_provider->restore_get_mechanism($volname, $storeid);
+    die "cannot restore non-LXC guest of type '$vmtype'\n" if $vmtype ne 'lxc';
+
+    my $info = $backup_provider->restore_container_init($volname, $storeid, {});
+    eval {
+	if ($mechanism eq 'tar') {
+	    my $tar_path = $info->{'tar-path'}
+		or die "did not get path to tar file from backup provider\n";
+	    die "not a regular file '$tar_path'" if !-f $tar_path;
+	    restore_tar_archive($tar_path, $rootdir, $conf, $no_unpack_error, $bwlimit);
+	} elsif ($mechanism eq 'directory') {
+	    my $directory = $info->{'archive-directory'}
+		or die "did not get path to archive directory from backup provider\n";
+	    die "not a directory '$directory'" if !-d $directory;
+
+	    my $create_cmd = [
+		'tar',
+		'cpf',
+		'-',
+		@PVE::Storage::Plugin::COMMON_TAR_FLAGS,
+		"--directory=$directory",
+		'.',
+	    ];
+
+	    my $extract_cmd = restore_tar_archive_command($conf, undef, $rootdir, $bwlimit);
+
+	    eval { PVE::Tools::run_command([$create_cmd, $extract_cmd]); };
+	    die $@ if $@ && !$no_unpack_error;
+	} else {
+	    die "mechanism '$mechanism' requested by backup provider is not supported for LXCs\n";
+	}
+    };
+    my $err = $@;
+    eval { $backup_provider->restore_container_cleanup($volname, $storeid, {}); };
+    if (my $cleanup_err = $@) {
+	die $cleanup_err if !$err;
+	warn $cleanup_err;
+    }
+    die $err if $err;
+
+}
+
 sub recover_config {
     my ($storage_cfg, $volid, $vmid) = @_;
 
@@ -135,6 +202,8 @@ sub recover_config {
 	my $scfg = PVE::Storage::storage_check_enabled($storage_cfg, $storeid);
 	if ($scfg->{type} eq 'pbs') {
 	    return recover_config_from_proxmox_backup($storage_cfg, $volid, $vmid);
+	} elsif (PVE::Storage::storage_has_feature($storage_cfg, $storeid, 'backup-provider')) {
+	    return recover_config_from_external_backup($storage_cfg, $volid, $vmid);
 	}
     }
 
@@ -209,6 +278,26 @@ sub recover_config_from_tar {
     return wantarray ? ($conf, $mp_param) : $conf;
 }
 
+sub recover_config_from_external_backup {
+    my ($storage_cfg, $volid, $vmid) = @_;
+
+    $vmid //= 0;
+
+    my $raw = PVE::Storage::extract_vzdump_config($storage_cfg, $volid);
+
+    my $conf = PVE::LXC::Config::parse_pct_config("/lxc/${vmid}.conf" , $raw);
+
+    delete $conf->{snapshots};
+
+    my $mp_param = {};
+    PVE::LXC::Config->foreach_volume($conf, sub {
+	my ($ms, $mountpoint) = @_;
+	$mp_param->{$ms} = $conf->{$ms};
+    });
+
+    return wantarray ? ($conf, $mp_param) : $conf;
+}
+
 sub restore_configuration {
     my ($vmid, $storage_cfg, $archive, $rootdir, $conf, $restricted, $unique, $skip_fw) = @_;
 
@@ -218,6 +307,26 @@ sub restore_configuration {
 	if ($scfg->{type} eq 'pbs') {
 	    return restore_configuration_from_proxmox_backup($vmid, $storage_cfg, $archive, $rootdir, $conf, $restricted, $unique, $skip_fw);
 	}
+	if (PVE::Storage::storage_has_feature($storage_cfg, $storeid, 'backup-provider')) {
+	    my $log_function = sub {
+		my ($log_level, $message) = @_;
+		my $prefix = $log_level eq 'err' ? 'ERROR' : uc($log_level);
+		print "$prefix: $message\n";
+	    };
+	    my $backup_provider =
+		PVE::Storage::new_backup_provider($storage_cfg, $storeid, $log_function);
+	    return restore_configuration_from_external_backup(
+		$backup_provider,
+		$vmid,
+		$storage_cfg,
+		$archive,
+		$rootdir,
+		$conf,
+		$restricted,
+		$unique,
+		$skip_fw,
+	    );
+	}
     }
     restore_configuration_from_etc_vzdump($vmid, $rootdir, $conf, $restricted, $unique, $skip_fw);
 }
@@ -258,6 +367,38 @@ sub restore_configuration_from_proxmox_backup {
     }
 }
 
+sub restore_configuration_from_external_backup {
+    my ($backup_provider, $vmid, $storage_cfg, $archive, $rootdir, $conf, $restricted, $unique, $skip_fw) = @_;
+
+    my ($storeid, $volname) = PVE::Storage::parse_volume_id($archive);
+    my $scfg = PVE::Storage::storage_config($storage_cfg, $storeid);
+
+    my ($vtype, $name, undef, undef, undef, undef, $format) =
+	PVE::Storage::parse_volname($storage_cfg, $archive);
+
+    my $oldconf = recover_config_from_external_backup($storage_cfg, $archive, $vmid);
+
+    sanitize_and_merge_config($conf, $oldconf, $restricted, $unique);
+
+    my $firewall_config =
+	$backup_provider->restore_get_firewall_config($volname, $storeid);
+
+    if ($firewall_config) {
+	my $pve_firewall_dir = '/etc/pve/firewall';
+	my $pct_fwcfg_target = "${pve_firewall_dir}/${vmid}.fw";
+	if ($skip_fw) {
+	    warn "ignoring firewall config from backup archive, lacking API permission to modify firewall.\n";
+	    warn "old firewall configuration in '$pct_fwcfg_target' left in place!\n"
+		if -e $pct_fwcfg_target;
+	} else {
+	    mkdir $pve_firewall_dir; # make sure the directory exists
+	    PVE::Tools::file_set_contents($pct_fwcfg_target, $firewall_config);
+	}
+    }
+
+    return;
+}
+
 sub sanitize_and_merge_config {
     my ($conf, $oldconf, $restricted, $unique) = @_;
 
-- 
2.39.5
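
The 'directory' mechanism thus becomes a tar-to-tar pipe split across the
privilege boundary: the creating tar runs privileged and reads the
provider's directory, while the extracting tar runs inside the container's
user namespace. Conceptually (with @mappings standing in for the ID-map
arguments produced by userns_command(), an illustrative simplification):

    # privileged half: serialize the provider's directory to stdout
    my $create_cmd = ['tar', 'cpf', '-', @COMMON_TAR_FLAGS, "--directory=$directory", '.'];

    # unprivileged half: extract under the container's ID mapping
    my $extract_cmd = ['lxc-usernsexec', @mappings, '--', 'tar', 'xpf', '-', '-C', $rootdir];

    # run_command() connects the two with a pipe, roughly equivalent to:
    #   tar cpf - ... | lxc-usernsexec ... -- tar xpf - ...
    PVE::Tools::run_command([$create_cmd, $extract_cmd]);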



_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


^ permalink raw reply	[flat|nested] 63+ messages in thread

* [pve-devel] [RFC container v3 29/34] external restore: don't use 'one-file-system' tar flag when restoring from a directory
  2024-11-07 16:51 [pve-devel] [RFC qemu/common/storage/qemu-server/container/manager v3 00/34] backup provider API Fiona Ebner
                   ` (27 preceding siblings ...)
  2024-11-07 16:51 ` [pve-devel] [RFC container v3 28/34] backup: implement restore for external providers Fiona Ebner
@ 2024-11-07 16:51 ` Fiona Ebner
  2024-11-07 16:51 ` [pve-devel] [RFC container v3 30/34] create: factor out compression option helper Fiona Ebner
                   ` (5 subsequent siblings)
  34 siblings, 0 replies; 63+ messages in thread
From: Fiona Ebner @ 2024-11-07 16:51 UTC (permalink / raw)
  To: pve-devel

This gives backup providers more freedom, e.g. to mount backed-up
mount point volumes individually.

Suggested-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---

New in v3.

 src/PVE/LXC/Create.pm | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/src/PVE/LXC/Create.pm b/src/PVE/LXC/Create.pm
index 8657ac1..719f372 100644
--- a/src/PVE/LXC/Create.pm
+++ b/src/PVE/LXC/Create.pm
@@ -167,11 +167,15 @@ sub restore_external_archive {
 		or die "did not get path to archive directory from backup provider\n";
 	    die "not a directory '$directory'" if !-d $directory;
 
+	    # Give the backup provider more freedom, e.g. to mount backed-up mount point volumes
+	    # individually.
+	    my @flags = grep { $_ ne '--one-file-system' } @PVE::Storage::Plugin::COMMON_TAR_FLAGS;
+
 	    my $create_cmd = [
 		'tar',
 		'cpf',
 		'-',
-		@PVE::Storage::Plugin::COMMON_TAR_FLAGS,
+		@flags,
 		"--directory=$directory",
 		'.',
 	    ];
-- 
2.39.5



_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel

^ permalink raw reply	[flat|nested] 63+ messages in thread

* [pve-devel] [RFC container v3 30/34] create: factor out compression option helper
  2024-11-07 16:51 [pve-devel] [RFC qemu/common/storage/qemu-server/container/manager v3 00/34] backup provider API Fiona Ebner
                   ` (28 preceding siblings ...)
  2024-11-07 16:51 ` [pve-devel] [RFC container v3 29/34] external restore: don't use 'one-file-system' tar flag when restoring from a directory Fiona Ebner
@ 2024-11-07 16:51 ` Fiona Ebner
  2024-11-07 16:51 ` [pve-devel] [RFC container v3 31/34] restore tar archive: check potentially untrusted archive Fiona Ebner
                   ` (4 subsequent siblings)
  34 siblings, 0 replies; 63+ messages in thread
From: Fiona Ebner @ 2024-11-07 16:51 UTC (permalink / raw)
  To: pve-devel

In preparation for re-using it to check potentially untrusted
archives.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---

New in v3.

 src/PVE/LXC/Create.pm | 51 +++++++++++++++++++++++++------------------
 1 file changed, 30 insertions(+), 21 deletions(-)

diff --git a/src/PVE/LXC/Create.pm b/src/PVE/LXC/Create.pm
index 719f372..d2f675e 100644
--- a/src/PVE/LXC/Create.pm
+++ b/src/PVE/LXC/Create.pm
@@ -78,15 +78,38 @@ sub restore_proxmox_backup_archive {
 	$scfg, $storeid, $cmd, $param, userns_cmd => $userns_cmd);
 }
 
+my sub tar_compression_option {
+    my ($archive) = @_;
+
+    my %compression_map = (
+	'.gz'  => '-z',
+	'.bz2' => '-j',
+	'.xz'  => '-J',
+	'.lzo'  => '--lzop',
+	'.zst'  => '--zstd',
+    );
+    if ($archive =~ /\.tar(\.[^.]+)?$/) {
+	if (defined($1)) {
+	    die "unrecognized compression format: $1\n" if !defined($compression_map{$1});
+	    return $compression_map{$1};
+	}
+	return;
+    } else {
+	die "file does not look like a template archive: $archive\n";
+    }
+}
+
 my sub restore_tar_archive_command {
-    my ($conf, $opts, $rootdir, $bwlimit) = @_;
+    my ($conf, $compression_opt, $rootdir, $bwlimit) = @_;
 
     my ($id_map, $root_uid, $root_gid) = PVE::LXC::parse_id_maps($conf);
     my $userns_cmd = PVE::LXC::userns_command($id_map);
 
-    my $cmd = [@$userns_cmd, 'tar', 'xpf', '-', $opts->@*, '--totals',
-               @PVE::Storage::Plugin::COMMON_TAR_FLAGS,
-               '-C', $rootdir];
+    my $cmd = [@$userns_cmd, 'tar', 'xpf', '-'];
+    push $cmd->@*, $compression_opt if $compression_opt;
+    push $cmd->@*, '--totals';
+    push $cmd->@*, @PVE::Storage::Plugin::COMMON_TAR_FLAGS;
+    push $cmd->@*, '-C', $rootdir;
 
     # skip-old-files doesn't have anything to do with time (old/new), but is
     # simply -k (annoyingly also called --keep-old-files) without the 'treat
@@ -108,24 +131,10 @@ sub restore_tar_archive {
 
     my $archive_fh;
     my $tar_input = '<&STDIN';
-    my @compression_opt;
+    my $compression_opt;
     if ($archive ne '-') {
 	# GNU tar refuses to autodetect this... *sigh*
-	my %compression_map = (
-	    '.gz'  => '-z',
-	    '.bz2' => '-j',
-	    '.xz'  => '-J',
-	    '.lzo'  => '--lzop',
-	    '.zst'  => '--zstd',
-	);
-	if ($archive =~ /\.tar(\.[^.]+)?$/) {
-	    if (defined($1)) {
-		die "unrecognized compression format: $1\n" if !defined($compression_map{$1});
-		@compression_opt = $compression_map{$1};
-	    }
-	} else {
-	    die "file does not look like a template archive: $archive\n";
-	}
+	$compression_opt = tar_compression_option($archive);
 	sysopen($archive_fh, $archive, O_RDONLY)
 	    or die "failed to open '$archive': $!\n";
 	my $flags = $archive_fh->fcntl(Fcntl::F_GETFD(), 0);
@@ -133,7 +142,7 @@ sub restore_tar_archive {
 	$tar_input = '<&'.fileno($archive_fh);
     }
 
-    my $cmd = restore_tar_archive_command($conf, [@compression_opt], $rootdir, $bwlimit);
+    my $cmd = restore_tar_archive_command($conf, $compression_opt, $rootdir, $bwlimit);
 
     if ($archive eq '-') {
 	print "extracting archive from STDIN\n";
-- 
2.39.5
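
As a quick sanity check of the factored-out helper, this is how it behaves
for a few typical archive names (derived directly from the mapping above):

    tar_compression_option('vzdump-lxc-100.tar.zst'); # returns '--zstd'
    tar_compression_option('rootfs.tar.xz');          # returns '-J'
    tar_compression_option('rootfs.tar');             # returns undef (no flag needed)
    tar_compression_option('rootfs.tgz');             # dies: does not look like a template archive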



_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


^ permalink raw reply	[flat|nested] 63+ messages in thread

* [pve-devel] [RFC container v3 31/34] restore tar archive: check potentially untrusted archive
  2024-11-07 16:51 [pve-devel] [RFC qemu/common/storage/qemu-server/container/manager v3 00/34] backup provider API Fiona Ebner
                   ` (29 preceding siblings ...)
  2024-11-07 16:51 ` [pve-devel] [RFC container v3 30/34] create: factor out compression option helper Fiona Ebner
@ 2024-11-07 16:51 ` Fiona Ebner
  2024-11-07 16:51 ` [pve-devel] [RFC container v3 32/34] api: add early check against restoring privileged container from external source Fiona Ebner
                   ` (3 subsequent siblings)
  34 siblings, 0 replies; 63+ messages in thread
From: Fiona Ebner @ 2024-11-07 16:51 UTC (permalink / raw)
  To: pve-devel

'tar' itself already protects against '..' in component names and
strips absolute member names when extracting (if not used with the
--absolute-names option) and in general seems sane for extraction.
Additionally, the extraction already happens in the user namespace
associated to the container. So for now, start out with some basic
sanity checks. The helper can still be extended with more checks.

Checks:

* list files in archive - will already catch many corrupted/bogus
  archives.

* check that there are at least 10 members - should also catch
  archives not actually containing a container root filesystem or
  structural issues early.

* check that /sbin directory or link exists in archive - ideally the
  check would be for /sbin/init, but this cannot be done efficiently
  before extraction, because it would require keeping track of the
  whole archive to be able to follow symlinks.

* abort if there is a multi-volume member in the archive - cheap and
  is never expected.

Checks that were considered, but not (yet) added:

* abort when a file has an unrealistically large size - while this
  could help to detect certain kinds of bogus archives, there can be
  valid use cases for extremely large sparse files, so it's not clear
  what a good limit would be (1 EiB maybe?). Also, an attacker could
  just adapt to such a limit by creating multiple files, and the actual
  extraction is already limited by the size of the allocated container
  volume.

* check that /sbin/init exists after extracting - cannot be done
  efficiently before extraction, because it would require keeping
  track of the whole archive to be able to follow symlinks. During
  setup there already is detection of /etc/os-release, so issues with
  the structure will already be noticed. Adding a hard fail for
  untrusted archives would require either passing that information to
  the setup phase or extracting the protected_call method from there
  into a helper.

* adding 'restrict' to the (common) tar flags - the tar manual (not
  the man page) documents: "Disable use of some potentially harmful
  'tar' options.  Currently this option disables shell invocation from
  multi-volume menu.". The flag was introduced in 2005 and this is
  still the only thing it is used for. Trying to restore a
  multi-volume archive already fails without giving multiple '--file'
  arguments and '--multi-volume', so don't bother adding the flag.

* check format of tar file - would require yet another invocation of
  the decompressor and there seems to be no built-in way to just
  display the format with 'tar'. The 'file' program could be used, but
  it seems not to distinguish old GNU from GNU formats, nor old POSIX
  from POSIX formats, with the old ones being the candidates to
  prohibit. So that would leave just detecting the old 'v7' format.

Suggested-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---

New in v3.

 src/PVE/LXC/Create.pm | 67 ++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 63 insertions(+), 4 deletions(-)

diff --git a/src/PVE/LXC/Create.pm b/src/PVE/LXC/Create.pm
index d2f675e..bf424f6 100644
--- a/src/PVE/LXC/Create.pm
+++ b/src/PVE/LXC/Create.pm
@@ -99,12 +99,65 @@ my sub tar_compression_option {
     }
 }
 
+# Basic checks trying to detect issues with a potentially untrusted or bogus tar archive.
+# Just listing the files is already a good check against corruption.
+# 'tar' itself already protects against '..' in component names and strips absolute member names
+# when extracting, so no need to check for those here.
+my sub check_tar_archive {
+    my ($archive) = @_;
+
+    print "checking archive..\n";
+
+    # To resolve links to get to 'sbin/init' would mean keeping track of everything in the archive,
+    # because the target might be ordered first. Check only that 'sbin' exists here.
+    my $found_sbin;
+
+    # Just to detect bogus archives, any valid container filesystem should have more than this.
+    my $required_members = 10;
+    my $member_count = 0;
+
+    my $check_file_list = sub {
+	my ($line) = @_;
+
+	# The date is in ISO 8601 format. The last part contains the potentially quoted file name,
+	# potentially followed by some additional info (e.g. where a link points to).
+	my ($type, $perms, $uid, $gid, $size, $date, $time, $file_info) =
+	    $line =~ m!^([a-zA-Z\-])(\S+)\s+(\d+)/(\d+)\s+(\d+)\s+(\S+)\s+(\S+)\s+(.*)$!;
+
+	die "found multi-volume member in archive\n" if $type eq 'M';
+
+	if (!$found_sbin && (
+	    ($file_info =~ m!^(?:\./)?sbin/$! && $type eq 'd')
+	    || ($file_info =~ m!^(?:\./)?sbin ->! && $type eq 'l')
+	    || ($file_info =~ m!^(?:\./)?sbin link to! && $type eq 'h')
+	)) {
+	    $found_sbin = 1;
+	}
+
+	$member_count++;
+    };
+
+    my $compression_opt = tar_compression_option($archive);
+
+    my $cmd = ['tar', '-tvf', $archive];
+    push $cmd->@*, $compression_opt if $compression_opt;
+    push $cmd->@*, '--numeric-owner';
+
+    PVE::Tools::run_command($cmd, outfunc => $check_file_list);
+
+    die "no 'sbin' directory (or link) found in archive '$archive'\n" if !$found_sbin;
+    die "less than $required_members members in archive '$archive'\n" if $member_count < $required_members;
+}
+
 my sub restore_tar_archive_command {
-    my ($conf, $compression_opt, $rootdir, $bwlimit) = @_;
+    my ($conf, $compression_opt, $rootdir, $bwlimit, $untrusted) = @_;
 
     my ($id_map, $root_uid, $root_gid) = PVE::LXC::parse_id_maps($conf);
     my $userns_cmd = PVE::LXC::userns_command($id_map);
 
+    die "refusing to restore privileged container backup from external source\n"
+	if $untrusted && ($root_uid == 0 || $root_gid == 0);
+
     my $cmd = [@$userns_cmd, 'tar', 'xpf', '-'];
     push $cmd->@*, $compression_opt if $compression_opt;
     push $cmd->@*, '--totals';
@@ -127,7 +180,7 @@ my sub restore_tar_archive_command {
 }
 
 sub restore_tar_archive {
-    my ($archive, $rootdir, $conf, $no_unpack_error, $bwlimit) = @_;
+    my ($archive, $rootdir, $conf, $no_unpack_error, $bwlimit, $untrusted) = @_;
 
     my $archive_fh;
     my $tar_input = '<&STDIN';
@@ -142,7 +195,12 @@ sub restore_tar_archive {
 	$tar_input = '<&'.fileno($archive_fh);
     }
 
-    my $cmd = restore_tar_archive_command($conf, $compression_opt, $rootdir, $bwlimit);
+    if ($untrusted) {
+	die "cannot verify untrusted archive on STDIN\n" if $archive eq '-';
+	check_tar_archive($archive);
+    }
+
+    my $cmd = restore_tar_archive_command($conf, $compression_opt, $rootdir, $bwlimit, $untrusted);
 
     if ($archive eq '-') {
 	print "extracting archive from STDIN\n";
@@ -170,7 +228,7 @@ sub restore_external_archive {
 	    my $tar_path = $info->{'tar-path'}
 		or die "did not get path to tar file from backup provider\n";
 	    die "not a regular file '$tar_path'" if !-f $tar_path;
-	    restore_tar_archive($tar_path, $rootdir, $conf, $no_unpack_error, $bwlimit);
+	    restore_tar_archive($tar_path, $rootdir, $conf, $no_unpack_error, $bwlimit, 1);
 	} elsif ($mechanism eq 'directory') {
 	    my $directory = $info->{'archive-directory'}
 		or die "did not get path to archive directory from backup provider\n";
@@ -189,6 +247,7 @@ sub restore_external_archive {
 		'.',
 	    ];
 
+	    # archive is trusted, we created it
 	    my $extract_cmd = restore_tar_archive_command($conf, undef, $rootdir, $bwlimit);
 
 	    eval { PVE::Tools::run_command([$create_cmd, $extract_cmd]); };
-- 
2.39.5
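
For reference, the listing lines that check_tar_archive() parses look like
GNU tar's verbose output with --numeric-owner (a made-up sample; exact
column widths vary between tar versions):

    drwxr-xr-x 0/0               0 2024-11-07 16:51 ./sbin/
    lrwxrwxrwx 0/0               0 2024-11-07 16:51 ./sbin -> usr/sbin
    -rw-r--r-- 0/0            1024 2024-11-07 16:51 ./etc/hostname

The regex splits each line into ($type, $perms, $uid, $gid, $size, $date,
$time, $file_info): the first sample yields $type = 'd' with $file_info =
'./sbin/' and satisfies the sbin check, the second matches the symlink
variant, and a multi-volume continuation member would show up with
$type = 'M' and abort the restore.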



_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel

^ permalink raw reply	[flat|nested] 63+ messages in thread

* [pve-devel] [RFC container v3 32/34] api: add early check against restoring privileged container from external source
  2024-11-07 16:51 [pve-devel] [RFC qemu/common/storage/qemu-server/container/manager v3 00/34] backup provider API Fiona Ebner
                   ` (30 preceding siblings ...)
  2024-11-07 16:51 ` [pve-devel] [RFC container v3 31/34] restore tar archive: check potentially untrusted archive Fiona Ebner
@ 2024-11-07 16:51 ` Fiona Ebner
  2024-11-07 16:51 ` [pve-devel] [PATCH manager v3 33/34] ui: backup: also check for backup subtype to classify archive Fiona Ebner
                   ` (2 subsequent siblings)
  34 siblings, 0 replies; 63+ messages in thread
From: Fiona Ebner @ 2024-11-07 16:51 UTC (permalink / raw)
  To: pve-devel

restore_external_archive() already has a check, but that one only
happens after an existing container has already been destroyed.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---

New in v3.

 src/PVE/API2/LXC.pm | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/src/PVE/API2/LXC.pm b/src/PVE/API2/LXC.pm
index 213e518..dca0e35 100644
--- a/src/PVE/API2/LXC.pm
+++ b/src/PVE/API2/LXC.pm
@@ -39,6 +39,17 @@ BEGIN {
     }
 }
 
+my sub assert_not_restore_from_external {
+    my ($archive, $storage_cfg) = @_;
+
+    my ($storeid, undef) = PVE::Storage::parse_volume_id($archive, 1);
+
+    return if !defined($storeid);
+    return if !PVE::Storage::storage_has_feature($storage_cfg, $storeid, 'backup-provider');
+
+    die "refusing to restore privileged container backup from external source\n";
+}
+
 my $check_storage_access_migrate = sub {
     my ($rpcenv, $authuser, $storecfg, $storage, $node) = @_;
 
@@ -408,6 +419,9 @@ __PACKAGE__->register_method({
 			$conf->{unprivileged} = $orig_conf->{unprivileged}
 			    if !defined($unprivileged) && defined($orig_conf->{unprivileged});
 
+			assert_not_restore_from_external($archive, $storage_cfg)
+			    if !$conf->{unprivileged};
+
 			# implicit privileged change is checked here
 			if ($old_conf->{unprivileged} && !$conf->{unprivileged}) {
 			    $rpcenv->check_vm_perm($authuser, $vmid, $pool, ['VM.Allocate']);
-- 
2.39.5



_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


^ permalink raw reply	[flat|nested] 63+ messages in thread

* [pve-devel] [PATCH manager v3 33/34] ui: backup: also check for backup subtype to classify archive
  2024-11-07 16:51 [pve-devel] [RFC qemu/common/storage/qemu-server/container/manager v3 00/34] backup provider API Fiona Ebner
                   ` (31 preceding siblings ...)
  2024-11-07 16:51 ` [pve-devel] [RFC container v3 32/34] api: add early check against restoring privileged container from external source Fiona Ebner
@ 2024-11-07 16:51 ` Fiona Ebner
  2024-11-07 16:51 ` [pve-devel] [RFC manager v3 34/34] backup: implement backup for external providers Fiona Ebner
  2024-11-12 15:50 ` [pve-devel] partially-applied: [RFC qemu/common/storage/qemu-server/container/manager v3 00/34] backup provider API Thomas Lamprecht
  34 siblings, 0 replies; 63+ messages in thread
From: Fiona Ebner @ 2024-11-07 16:51 UTC (permalink / raw)
  To: pve-devel

In anticipation of future storage plugins that might neither use
PBS-specific formats nor adhere to the vzdump naming scheme for
backups.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---

No changes in v3.

 www/manager6/Utils.js              | 10 ++++++----
 www/manager6/grid/BackupView.js    |  4 ++--
 www/manager6/storage/BackupView.js |  4 ++--
 3 files changed, 10 insertions(+), 8 deletions(-)

diff --git a/www/manager6/Utils.js b/www/manager6/Utils.js
index db86fa9a..a8e4e8ee 100644
--- a/www/manager6/Utils.js
+++ b/www/manager6/Utils.js
@@ -693,12 +693,14 @@ Ext.define('PVE.Utils', {
 	'snippets': gettext('Snippets'),
     },
 
-    volume_is_qemu_backup: function(volid, format) {
-	return format === 'pbs-vm' || volid.match(':backup/vzdump-qemu-');
+    volume_is_qemu_backup: function(volume) {
+	return volume.format === 'pbs-vm' || volume.volid.match(':backup/vzdump-qemu-') ||
+	    volume.subtype === 'qemu';
     },
 
-    volume_is_lxc_backup: function(volid, format) {
-	return format === 'pbs-ct' || volid.match(':backup/vzdump-(lxc|openvz)-');
+    volume_is_lxc_backup: function(volume) {
+	return volume.format === 'pbs-ct' || volume.volid.match(':backup/vzdump-(lxc|openvz)-') ||
+	    volume.subtype === 'lxc';
     },
 
     authSchema: {
diff --git a/www/manager6/grid/BackupView.js b/www/manager6/grid/BackupView.js
index e71d1c88..ef3649c6 100644
--- a/www/manager6/grid/BackupView.js
+++ b/www/manager6/grid/BackupView.js
@@ -29,11 +29,11 @@ Ext.define('PVE.grid.BackupView', {
 	var vmtypeFilter;
 	if (vmtype === 'lxc' || vmtype === 'openvz') {
 	    vmtypeFilter = function(item) {
-		return PVE.Utils.volume_is_lxc_backup(item.data.volid, item.data.format);
+		return PVE.Utils.volume_is_lxc_backup(item.data);
 	    };
 	} else if (vmtype === 'qemu') {
 	    vmtypeFilter = function(item) {
-		return PVE.Utils.volume_is_qemu_backup(item.data.volid, item.data.format);
+		return PVE.Utils.volume_is_qemu_backup(item.data);
 	    };
 	} else {
 	    throw "unsupported VM type '" + vmtype + "'";
diff --git a/www/manager6/storage/BackupView.js b/www/manager6/storage/BackupView.js
index db184def..749c2136 100644
--- a/www/manager6/storage/BackupView.js
+++ b/www/manager6/storage/BackupView.js
@@ -86,9 +86,9 @@ Ext.define('PVE.storage.BackupView', {
 		disabled: true,
 		handler: function(b, e, rec) {
 		    let vmtype;
-		    if (PVE.Utils.volume_is_qemu_backup(rec.data.volid, rec.data.format)) {
+		    if (PVE.Utils.volume_is_qemu_backup(rec.data)) {
 			vmtype = 'qemu';
-		    } else if (PVE.Utils.volume_is_lxc_backup(rec.data.volid, rec.data.format)) {
+		    } else if (PVE.Utils.volume_is_lxc_backup(rec.data)) {
 			vmtype = 'lxc';
 		    } else {
 			return;
-- 
2.39.5



_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


^ permalink raw reply	[flat|nested] 63+ messages in thread

* [pve-devel] [RFC manager v3 34/34] backup: implement backup for external providers
  2024-11-07 16:51 [pve-devel] [RFC qemu/common/storage/qemu-server/container/manager v3 00/34] backup provider API Fiona Ebner
                   ` (32 preceding siblings ...)
  2024-11-07 16:51 ` [pve-devel] [PATCH manager v3 33/34] ui: backup: also check for backup subtype to classify archive Fiona Ebner
@ 2024-11-07 16:51 ` Fiona Ebner
  2024-11-12 15:50 ` [pve-devel] partially-applied: [RFC qemu/common/storage/qemu-server/container/manager v3 00/34] backup provider API Thomas Lamprecht
  34 siblings, 0 replies; 63+ messages in thread
From: Fiona Ebner @ 2024-11-07 16:51 UTC (permalink / raw)
  To: pve-devel

Hooks from the backup provider are called during start/end/abort,
both for the job as a whole and for each individual guest backup.
Some log messages need to be adapted and some things special-cased in
the same way as is already done for PBS, e.g. log file handling.

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---

Changes in v3:
* use new storage_has_feature() helper

 PVE/VZDump.pm           | 57 ++++++++++++++++++++++++++++++++++++-----
 test/vzdump_new_test.pl |  3 +++
 2 files changed, 53 insertions(+), 7 deletions(-)

diff --git a/PVE/VZDump.pm b/PVE/VZDump.pm
index fd89945e..0b1b5dc1 100644
--- a/PVE/VZDump.pm
+++ b/PVE/VZDump.pm
@@ -217,7 +217,10 @@ sub storage_info {
     $info->{'prune-backups'} = PVE::JSONSchema::parse_property_string('prune-backups', $scfg->{'prune-backups'})
 	if defined($scfg->{'prune-backups'});
 
-    if ($type eq 'pbs') {
+    if (PVE::Storage::storage_has_feature($cfg, $storage, 'backup-provider')) {
+	$info->{'backup-provider'} =
+	    PVE::Storage::new_backup_provider($cfg, $storage, sub { debugmsg($_[0], $_[1]); });
+    } elsif ($type eq 'pbs') {
 	$info->{pbs} = 1;
     } else {
 	$info->{dumpdir} = PVE::Storage::get_backup_dir($cfg, $storage);
@@ -717,6 +720,7 @@ sub new {
 	    $opts->{scfg} = $info->{scfg};
 	    $opts->{pbs} = $info->{pbs};
 	    $opts->{'prune-backups'} //= $info->{'prune-backups'};
+	    $self->{'backup-provider'} = $info->{'backup-provider'} if $info->{'backup-provider'};
 	}
     } elsif ($opts->{dumpdir}) {
 	$add_error->("dumpdir '$opts->{dumpdir}' does not exist")
@@ -1001,7 +1005,7 @@ sub exec_backup_task {
 	    }
 	}
 
-	if (!$self->{opts}->{pbs}) {
+	if (!$self->{opts}->{pbs} && !$self->{'backup-provider'}) {
 	    $task->{logfile} = "$opts->{dumpdir}/$basename.log";
 	}
 
@@ -1011,7 +1015,11 @@ sub exec_backup_task {
 	    $ext .= ".${comp_ext}";
 	}
 
-	if ($self->{opts}->{pbs}) {
+	if ($self->{'backup-provider'}) {
+	    die "unable to pipe backup to stdout\n" if $opts->{stdout};
+	    $task->{target} = $self->{'backup-provider'}->backup_get_archive_name(
+		$vmid, $vmtype, $task->{backup_time});
+	} elsif ($self->{opts}->{pbs}) {
 	    die "unable to pipe backup to stdout\n" if $opts->{stdout};
 	    $task->{target} = $pbs_snapshot_name;
 	} else {
@@ -1029,7 +1037,7 @@ sub exec_backup_task {
 	my $pid = $$;
 	if ($opts->{tmpdir}) {
 	    $task->{tmpdir} = "$opts->{tmpdir}/vzdumptmp${pid}_$vmid/";
-	} elsif ($self->{opts}->{pbs}) {
+	} elsif ($self->{opts}->{pbs} || $self->{'backup-provider'}) {
 	    $task->{tmpdir} = "/var/tmp/vzdumptmp${pid}_$vmid";
 	} else {
 	    # dumpdir is posix? then use it as temporary dir
@@ -1101,6 +1109,10 @@ sub exec_backup_task {
 	if ($mode eq 'stop') {
 	    $plugin->prepare ($task, $vmid, $mode);
 
+	    if ($self->{'backup-provider'}) {
+		$self->{'backup-provider'}->backup_hook(
+		    'start', $vmid, $vmtype, { 'start-time' => $task->{backup_time} });
+	    }
 	    $self->run_hook_script ('backup-start', $task, $logfd);
 
 	    if ($running) {
@@ -1115,6 +1127,10 @@ sub exec_backup_task {
 	} elsif ($mode eq 'suspend') {
 	    $plugin->prepare ($task, $vmid, $mode);
 
+	    if ($self->{'backup-provider'}) {
+		$self->{'backup-provider'}->backup_hook(
+		    'start', $vmid, $vmtype, { 'start-time' => $task->{backup_time} });
+	    }
 	    $self->run_hook_script ('backup-start', $task, $logfd);
 
 	    if ($vmtype eq 'lxc') {
@@ -1141,6 +1157,10 @@ sub exec_backup_task {
 	    }
 
 	} elsif ($mode eq 'snapshot') {
+	    if ($self->{'backup-provider'}) {
+		$self->{'backup-provider'}->backup_hook(
+		    'start', $vmid, $vmtype, { 'start-time' => $task->{backup_time} });
+	    }
 	    $self->run_hook_script ('backup-start', $task, $logfd);
 
 	    my $snapshot_count = $task->{snapshot_count} || 0;
@@ -1183,11 +1203,13 @@ sub exec_backup_task {
 	    return;
 	}
 
-	my $archive_txt = $self->{opts}->{pbs} ? 'Proxmox Backup Server' : 'vzdump';
+	my $archive_txt = 'vzdump';
+	$archive_txt = 'Proxmox Backup Server' if $self->{opts}->{pbs};
+	$archive_txt = $self->{'backup-provider'}->provider_name() if $self->{'backup-provider'};
 	debugmsg('info', "creating $archive_txt archive '$task->{target}'", $logfd);
 	$plugin->archive($task, $vmid, $task->{tmptar}, $comp);
 
-	if ($self->{opts}->{pbs}) {
+	if ($self->{'backup-provider'} || $self->{opts}->{pbs}) {
 	    # size is added to task struct in guest vzdump plugins
 	} else {
 	    rename ($task->{tmptar}, $task->{target}) ||
@@ -1201,7 +1223,8 @@ sub exec_backup_task {
 
 	# Mark as protected before pruning.
 	if (my $storeid = $opts->{storage}) {
-	    my $volname = $opts->{pbs} ? $task->{target} : basename($task->{target});
+	    my $volname = $opts->{pbs} || $self->{'backup-provider'} ? $task->{target}
+	                                                             : basename($task->{target});
 	    my $volid = "${storeid}:backup/${volname}";
 
 	    if ($opts->{'notes-template'} && $opts->{'notes-template'} ne '') {
@@ -1254,6 +1277,8 @@ sub exec_backup_task {
 	    debugmsg ('info', "pruned $pruned backup(s)${log_pruned_extra}", $logfd);
 	}
 
+	$self->{'backup-provider'}->backup_hook('end', $vmid, $vmtype, {})
+	    if $self->{'backup-provider'};
 	$self->run_hook_script ('backup-end', $task, $logfd);
     };
     my $err = $@;
@@ -1313,6 +1338,14 @@ sub exec_backup_task {
 	debugmsg ('err', "Backup of VM $vmid failed - $err", $logfd, 1);
 	debugmsg ('info', "Failed at " . strftime("%F %H:%M:%S", localtime()));
 
+	if ($self->{'backup-provider'}) {
+	    eval {
+		$self->{'backup-provider'}->backup_hook(
+		    'abort', $vmid, $task->{vmtype}, { error => $err });
+	    };
+	    debugmsg('warn', "hook 'backup-abort' for external provider failed - $@") if $@;
+	}
+
 	eval { $self->run_hook_script ('backup-abort', $task, $logfd); };
 	debugmsg('warn', $@) if $@; # message already contains command with phase name
 
@@ -1340,6 +1373,8 @@ sub exec_backup_task {
 		};
 		debugmsg('warn', "$@") if $@; # $@ contains already error prefix
 	    }
+	} elsif ($self->{'backup-provider'}) {
+	    $self->{'backup-provider'}->backup_handle_log_file($vmid, $task->{tmplog});
 	} elsif ($task->{logfile}) {
 	    system {'cp'} 'cp', $task->{tmplog}, $task->{logfile};
 	}
@@ -1398,6 +1433,8 @@ sub exec_backup {
     my $errcount = 0;
     eval {
 
+	$self->{'backup-provider'}->job_hook('start', { 'start-time' => $starttime })
+	    if $self->{'backup-provider'};
 	$self->run_hook_script ('job-start', undef, $job_start_fd);
 
 	foreach my $task (@$tasklist) {
@@ -1405,11 +1442,17 @@ sub exec_backup {
 	    $errcount += 1 if $task->{state} ne 'ok';
 	}
 
+	$self->{'backup-provider'}->job_hook('end') if $self->{'backup-provider'};
 	$self->run_hook_script ('job-end', undef, $job_end_fd);
     };
     my $err = $@;
 
     if ($err) {
+	if ($self->{'backup-provider'}) {
+	    eval { $self->{'backup-provider'}->job_hook('abort', { error => $err }); };
+	    $err .= "hook 'job-abort' for external provider failed - $@" if $@;
+	}
+
 	eval { $self->run_hook_script ('job-abort', undef, $job_end_fd); };
 	$err .= $@ if $@;
 	debugmsg ('err', "Backup job failed - $err", undef, 1);
diff --git a/test/vzdump_new_test.pl b/test/vzdump_new_test.pl
index 8cd73075..01f2a661 100755
--- a/test/vzdump_new_test.pl
+++ b/test/vzdump_new_test.pl
@@ -51,6 +51,9 @@ $pve_storage_module->mock(
     activate_storage => sub {
 	return;
     },
+    get_backup_provider => sub {
+	return;
+    },
 );
 
 my $pve_cluster_module = Test::MockModule->new('PVE::Cluster');
-- 
2.39.5
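
Putting the added call sites together, the hook sequence an external
provider observes for a successful job with a single guest looks roughly
like this (a sketch derived from the hunks above, not authoritative API
documentation):

    job_hook('start', { 'start-time' => $starttime });

        backup_hook('start', $vmid, $vmtype, { 'start-time' => $task->{backup_time} });
        # the guest plugin creates the archive via the provider's backup methods
        backup_hook('end', $vmid, $vmtype, {});
        backup_handle_log_file($vmid, $task->{tmplog}); # task log handed over

    job_hook('end');

    # on failure, the corresponding 'abort' phases run instead, with the
    # error passed along:
    #   backup_hook('abort', $vmid, $vmtype, { error => $err });
    #   job_hook('abort', { error => $err });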



_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


^ permalink raw reply	[flat|nested] 63+ messages in thread

* [pve-devel] applied: [PATCH qemu-server v3 16/34] move nbd_stop helper to QMPHelpers module
  2024-11-07 16:51 ` [pve-devel] [PATCH qemu-server v3 16/34] move nbd_stop helper to QMPHelpers module Fiona Ebner
@ 2024-11-11 13:55   ` Fabian Grünbichler
  0 siblings, 0 replies; 63+ messages in thread
From: Fabian Grünbichler @ 2024-11-11 13:55 UTC (permalink / raw)
  To: Proxmox VE development discussion

On November 7, 2024 5:51 pm, Fiona Ebner wrote:
> Like this nbd_stop() can be called from a module that cannot include
> QemuServer.pm.
> 
> Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
> ---
> 
> No changes in v3.
> 
>  PVE/API2/Qemu.pm             | 3 ++-
>  PVE/CLI/qm.pm                | 3 ++-
>  PVE/QemuServer.pm            | 6 ------
>  PVE/QemuServer/QMPHelpers.pm | 6 ++++++
>  4 files changed, 10 insertions(+), 8 deletions(-)
> 
> diff --git a/PVE/API2/Qemu.pm b/PVE/API2/Qemu.pm
> index 848001b6..1c3cb271 100644
> --- a/PVE/API2/Qemu.pm
> +++ b/PVE/API2/Qemu.pm
> @@ -35,6 +35,7 @@ use PVE::QemuServer::Monitor qw(mon_cmd);
>  use PVE::QemuServer::Machine;
>  use PVE::QemuServer::Memory qw(get_current_memory);
>  use PVE::QemuServer::PCI;
> +use PVE::QemuServer::QMPHelpers;
>  use PVE::QemuServer::USB;
>  use PVE::QemuMigrate;
>  use PVE::RPCEnvironment;
> @@ -5910,7 +5911,7 @@ __PACKAGE__->register_method({
>  		    return;
>  		},
>  		'nbdstop' => sub {
> -		    PVE::QemuServer::nbd_stop($state->{vmid});
> +		    PVE::QemuServer::QMPHelpers::nbd_stop($state->{vmid});
>  		    return;
>  		},
>  		'resume' => sub {
> diff --git a/PVE/CLI/qm.pm b/PVE/CLI/qm.pm
> index 8d8ce10a..47b87782 100755
> --- a/PVE/CLI/qm.pm
> +++ b/PVE/CLI/qm.pm
> @@ -35,6 +35,7 @@ use PVE::QemuServer::Agent qw(agent_available);
>  use PVE::QemuServer::ImportDisk;
>  use PVE::QemuServer::Monitor qw(mon_cmd);
>  use PVE::QemuServer::OVF;
> +use PVE::QemuServer::QMPHelpers;
>  use PVE::QemuServer;
>  
>  use PVE::CLIHandler;
> @@ -385,7 +386,7 @@ __PACKAGE__->register_method ({
>  
>  	my $vmid = $param->{vmid};
>  
> -	eval { PVE::QemuServer::nbd_stop($vmid) };
> +	eval { PVE::QemuServer::QMPHelpers::nbd_stop($vmid) };
>  	warn $@ if $@;
>  
>  	return;
> diff --git a/PVE/QemuServer.pm b/PVE/QemuServer.pm
> index 0df3bda0..49b6ca17 100644
> --- a/PVE/QemuServer.pm
> +++ b/PVE/QemuServer.pm
> @@ -8606,12 +8606,6 @@ sub generate_smbios1_uuid {
>      return "uuid=".generate_uuid();
>  }
>  
> -sub nbd_stop {
> -    my ($vmid) = @_;
> -
> -    mon_cmd($vmid, 'nbd-server-stop', timeout => 25);
> -}
> -
>  sub create_reboot_request {
>      my ($vmid) = @_;
>      open(my $fh, '>', "/run/qemu-server/$vmid.reboot")
> diff --git a/PVE/QemuServer/QMPHelpers.pm b/PVE/QemuServer/QMPHelpers.pm
> index 0269ea46..826938de 100644
> --- a/PVE/QemuServer/QMPHelpers.pm
> +++ b/PVE/QemuServer/QMPHelpers.pm
> @@ -15,6 +15,12 @@ qemu_objectadd
>  qemu_objectdel
>  );
>  
> +sub nbd_stop {
> +    my ($vmid) = @_;
> +
> +    mon_cmd($vmid, 'nbd-server-stop', timeout => 25);
> +}
> +
>  sub qemu_deviceadd {
>      my ($vmid, $devicefull) = @_;
>  
> -- 
> 2.39.5
> 
> 
> 
> _______________________________________________
> pve-devel mailing list
> pve-devel@lists.proxmox.com
> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
> 
> 
> 


_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [pve-devel] [PATCH qemu-server v3 19/34] backup: keep track of block-node size for fleecing
  2024-11-07 16:51 ` [pve-devel] [PATCH qemu-server v3 19/34] backup: keep track of block-node size for fleecing Fiona Ebner
@ 2024-11-11 14:22   ` Fabian Grünbichler
  2024-11-12  9:50     ` Fiona Ebner
  0 siblings, 1 reply; 63+ messages in thread
From: Fabian Grünbichler @ 2024-11-11 14:22 UTC (permalink / raw)
  To: Proxmox VE development discussion

On November 7, 2024 5:51 pm, Fiona Ebner wrote:
> For fleecing, the size needs to match exactly what QEMU sees. In
> particular, EFI disks might be attached with a 'size=' option, meaning
> that size can be different from the volume's size. Commit 36377acf
> ("backup: disk info: also keep track of size") introduced size
> tracking and it was used for fleecing since then, but the accurate
> size information needs to be queried via QMP.
> 
> Should also help with the following issue reported in the community
> forum:
> https://forum.proxmox.com/threads/152202
> 
> Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
> ---
> 
> Changes in v3:
> * only use query-block QMP command after the VM is enforced running
> 
>  PVE/VZDump/QemuServer.pm | 37 ++++++++++++++++++++++++++++++++-----
>  1 file changed, 32 insertions(+), 5 deletions(-)
> 
> diff --git a/PVE/VZDump/QemuServer.pm b/PVE/VZDump/QemuServer.pm
> index c46e607c..1ebafe6d 100644
> --- a/PVE/VZDump/QemuServer.pm
> +++ b/PVE/VZDump/QemuServer.pm
> @@ -551,7 +551,7 @@ my sub allocate_fleecing_images {
>  		my $name = "vm-$vmid-fleece-$n";
>  		$name .= ".$format" if $scfg->{path};
>  
> -		my $size = PVE::Tools::convert_size($di->{size}, 'b' => 'kb');
> +		my $size = PVE::Tools::convert_size($di->{'block-node-size'}, 'b' => 'kb');
>  
>  		$di->{'fleece-volid'} = PVE::Storage::vdisk_alloc(
>  		    $self->{storecfg}, $fleecing_storeid, $vmid, $format, $name, $size);
> @@ -600,7 +600,7 @@ my sub attach_fleecing_images {
>  	    my $drive = "file=$path,if=none,id=$devid,format=$format,discard=unmap";
>  	    # Specify size explicitly, to make it work if storage backend rounded up size for
>  	    # fleecing image when allocating.
> -	    $drive .= ",size=$di->{size}" if $format eq 'raw';
> +	    $drive .= ",size=$di->{'block-node-size'}" if $format eq 'raw';
>  	    $drive =~ s/\\/\\\\/g;
>  	    my $ret = PVE::QemuServer::Monitor::hmp_cmd($vmid, "drive_add auto \"$drive\"", 60);
>  	    die "attaching fleecing image $volid failed - $ret\n" if $ret !~ m/OK/s;
> @@ -609,7 +609,7 @@ my sub attach_fleecing_images {
>  }
>  
>  my sub check_and_prepare_fleecing {
> -    my ($self, $vmid, $fleecing_opts, $disks, $is_template, $qemu_support) = @_;
> +    my ($self, $task, $vmid, $fleecing_opts, $disks, $is_template, $qemu_support) = @_;

$disks here is $task->{disks} (see below)

>  
>      # Even if the VM was started specifically for fleecing, it's possible that the VM is resumed and
>      # then starts doing IO. For VMs that are not resumed the fleecing images will just stay empty,
> @@ -626,6 +626,8 @@ my sub check_and_prepare_fleecing {
>      }
>  
>      if ($use_fleecing) {
> +	$self->query_block_node_sizes($vmid, $task);

query_block_node_sizes only uses $task->{disks}

> +
>  	my ($default_format, $valid_formats) = PVE::Storage::storage_default_format(
>  	    $self->{storecfg}, $fleecing_opts->{storage});
>  	my $format = scalar(grep { $_ eq 'qcow2' } $valid_formats->@*) ? 'qcow2' : 'raw';
> @@ -721,7 +723,7 @@ sub archive_pbs {
>  	my $is_template = PVE::QemuConfig->is_template($self->{vmlist}->{$vmid});
>  
>  	$task->{'use-fleecing'} = check_and_prepare_fleecing(
> -	    $self, $vmid, $opts->{fleecing}, $task->{disks}, $is_template, $qemu_support);
> +	    $self, $task, $vmid, $opts->{fleecing}, $task->{disks}, $is_template, $qemu_support);
>  
>  	my $fs_frozen = $self->qga_fs_freeze($task, $vmid);
>  
> @@ -905,7 +907,7 @@ sub archive_vma {
>  	$attach_tpmstate_drive->($self, $task, $vmid);
>  
>  	$task->{'use-fleecing'} = check_and_prepare_fleecing(
> -	    $self, $vmid, $opts->{fleecing}, $task->{disks}, $is_template, $qemu_support);
> +	    $self, $task, $vmid, $opts->{fleecing}, $task->{disks}, $is_template, $qemu_support);

here we can see $task->{disks} being passed in

>  
>  	my $outfh;
>  	if ($opts->{stdout}) {
> @@ -1042,6 +1044,31 @@ sub qga_fs_thaw {
>      $self->logerr($@) if $@;
>  }
>  
> +# The size for fleecing images needs to be exactly the same size as QEMU sees. E.g. an EFI disk can be
> +# attached with a smaller size than the underlying image on the storage.
> +sub query_block_node_sizes {
> +    my ($self, $vmid, $task) = @_;
> +
> +    my $block_info = mon_cmd($vmid, "query-block");
> +    $block_info = { map { $_->{device} => $_ } $block_info->@* };
> +
> +    for my $diskinfo ($task->{disks}->@*) {

only usage of $task

so we don't actually need to add $task as parameter to the two existing
subs, but can just modify this here to take $task->{disks} directly? or
did I overlook something?

if we do have to keep $task as parameter, it should come before $vmid in
the argument list, to be consistent with the rest..

other than that, consider this patch

Reviewed-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>

> +	my $drive_key = $diskinfo->{virtdev};
> +	$drive_key .= "-backup" if $drive_key eq 'tpmstate0';
> +	my $block_node_size =
> +	    eval { $block_info->{"drive-$drive_key"}->{inserted}->{image}->{'virtual-size'}; };
> +	if (!$block_node_size) {
> +	    $self->loginfo(
> +		"could not determine block node size of drive '$drive_key' - using fallback");
> +	    $block_node_size = $diskinfo->{size}
> +		or die "could not determine size of drive '$drive_key'\n";
> +	}
> +	$diskinfo->{'block-node-size'} = $block_node_size;
> +    }
> +
> +    return;
> +}
> +
>  # we need a running QEMU/KVM process for backup, starts a paused (prelaunch)
>  # one if VM isn't already running
>  sub enforce_vm_running_for_backup {
> -- 
> 2.39.5
> 
> 
> 
> _______________________________________________
> pve-devel mailing list
> pve-devel@lists.proxmox.com
> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
> 
> 
> 


_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
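
A sketch of the simplification suggested above, passing the disk list
instead of the whole $task (assuming, as argued, that nothing else from
$task is needed):

    sub query_block_node_sizes {
        my ($self, $vmid, $disks) = @_;

        my $block_info = mon_cmd($vmid, "query-block");
        $block_info = { map { $_->{device} => $_ } $block_info->@* };

        for my $diskinfo ($disks->@*) {
            # ... unchanged size lookup and fallback logic ...
        }
        return;
    }

    # call sites then simply pass the list:
    $self->query_block_node_sizes($vmid, $task->{disks});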

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [pve-devel] [RFC common v3 10/34] env: add module with helpers to run a Perl subroutine in a user namespace
  2024-11-07 16:51 ` [pve-devel] [RFC common v3 10/34] env: add module with helpers to run a Perl subroutine in a user namespace Fiona Ebner
@ 2024-11-11 18:33   ` Thomas Lamprecht
  2024-11-12 10:19     ` Fiona Ebner
  2024-11-12 14:20   ` Fabian Grünbichler
  1 sibling, 1 reply; 63+ messages in thread
From: Thomas Lamprecht @ 2024-11-11 18:33 UTC (permalink / raw)
  To: Proxmox VE development discussion, Fiona Ebner

On 07.11.24 at 17:51, Fiona Ebner wrote:
> The first use case is running the container backup subroutine for
> external providers inside a user namespace. That allows them to see
> the filesystem to back up from the container's perspective and also
> improves security because of isolation.
> 
> Copied and adapted the relevant parts from the pve-buildpkg
> repository.
> 
> Originally-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
> [FE: add $idmap parameter, drop $aux_groups parameter]
> Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
> ---
> 
> New in v3.
> 
>  src/Makefile   |   1 +
>  src/PVE/Env.pm | 136 +++++++++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 137 insertions(+)
>  create mode 100644 src/PVE/Env.pm
> 
> diff --git a/src/Makefile b/src/Makefile
> index 2d8bdc4..dba26e3 100644
> --- a/src/Makefile
> +++ b/src/Makefile
> @@ -15,6 +15,7 @@ LIB_SOURCES = \
>  	Certificate.pm \
>  	CpuSet.pm \
>  	Daemon.pm \
> +	Env.pm \
>  	Exception.pm \
>  	Format.pm \
>  	INotify.pm \
> diff --git a/src/PVE/Env.pm b/src/PVE/Env.pm
> new file mode 100644
> index 0000000..e11bec0
> --- /dev/null
> +++ b/src/PVE/Env.pm
> @@ -0,0 +1,136 @@
> +package PVE::Env;

can this module and its name be more specific to doing stuff with/in namespaces?

e.g. PVE::Namespaces or PVE::Sys::Namespaces (there might be other stuff that might
fit well in a future libproxmox-sys-perl and Proxmox::Sys::* respectively, so
maybe that module path would be better?)

I'd also make all subs private if they're not really intended to be used
outside this module.

If the more general fork/wait-child helpers are needed elsewhere, or deemed
to be useful, then they could go in their own module, like e.g. PVE::Sys::Process
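
Rough sketch of what I mean (module path just an example):

    package PVE::Sys::Namespaces;

    # internal helpers become lexical subs, invisible to callers:
    my sub __set_id_map { ... }
    my sub set_id_map { ... }

    # only the intended entry point stays a package sub:
    sub run_in_userns(&;$) { ... }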

> +
> +use strict;
> +use warnings;
> +
> +use Fcntl qw(O_WRONLY);
> +use POSIX qw(EINTR);
> +use Socket;
> +
> +require qw(syscall.ph);
> +
> +use constant {CLONE_NEWNS   => 0x00020000,
> +              CLONE_NEWUSER => 0x10000000};
> +
> +sub unshare($) {
> +    my ($flags) = @_;
> +    return 0 == syscall(272, $flags);
> +}
> +
> +sub __set_id_map($$$) {
> +    my ($pid, $what, $value) = @_;
> +    sysopen(my $fd, "/proc/$pid/${what}_map", O_WRONLY)
> +	or die "failed to open child process' ${what}_map\n";
> +    my $rc = syswrite($fd, $value);
> +    if (!$rc || $rc != length($value)) {
> +	die "failed to set sub$what: $!\n";
> +    }
> +    close($fd);
> +}
> +
> +sub set_id_map($$) {
> +    my ($pid, $id_map) = @_;
> +
> +    my $gid_map = '';
> +    my $uid_map = '';
> +
> +    for my $map ($id_map->@*) {
> +	my ($type, $ct, $host, $length) = $map->@*;
> +
> +	$gid_map .= "$ct $host $length\n" if $type eq 'g';
> +	$uid_map .= "$ct $host $length\n" if $type eq 'u';
> +    }
> +
> +    __set_id_map($pid, 'gid', $gid_map) if $gid_map;
> +    __set_id_map($pid, 'uid', $uid_map) if $uid_map;
> +}
> +
> +sub wait_for_child($;$) {
> +    my ($pid, $noerr) = @_;
> +    my $interrupts = 0;
> +    while (waitpid($pid, 0) != $pid) {
> +	if ($! == EINTR) {
> +	    warn "interrupted...\n";
> +	    kill(($interrupts > 3 ? 9 : 15), $pid);
> +	    $interrupts++;
> +	}
> +    }
> +    my $status = POSIX::WEXITSTATUS($?);
> +    return $status if $noerr;
> +
> +    if ($? == -1) {
> +	die "failed to execute\n";
> +    } elsif (POSIX::WIFSIGNALED($?)) {
> +	my $sig = POSIX::WTERMSIG($?);
> +	die "got signal $sig\n";
> +    } elsif ($status != 0) {
> +	warn "exit code $status\n";
> +    }
> +    return $status;
> +}
> +
> +sub forked(&%) {

FWIW, there's some "forked" method in test/lock_file.pl that this might replace too,
if it stays public.

> +    my ($code, %opts) = @_;
> +
> +    pipe(my $except_r, my $except_w) or die "pipe: $!\n";
> +
> +    my $pid = fork();
> +    die "fork failed: $!\n" if !defined($pid);
> +
> +    if ($pid == 0) {
> +	close($except_r);
> +	eval { $code->() };
> +	if ($@) {
> +	    print {$except_w} $@;
> +	    $except_w->flush();
> +	    POSIX::_exit(1);
> +	}
> +	POSIX::_exit(0);
> +    }
> +    close($except_w);
> +
> +    my $err;
> +    if (my $afterfork = $opts{afterfork}) {
> +	eval { $afterfork->($pid); };
> +	if ($err = $@) {
> +	    kill(15, $pid);
> +	    $opts{noerr} = 1;
> +	}
> +    }
> +    if (!$err) {
> +	$err = do { local $/ = undef; <$except_r> };
> +    }
> +    my $rv = wait_for_child($pid, $opts{noerr});
> +    die $err if $err;
> +    die "an unknown error occurred\n" if $rv != 0;
> +    return $rv;
> +}
> +
> +sub run_in_userns(&;$) {
> +    my ($code, $id_map) = @_;
> +    socketpair(my $sp, my $sc, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
> +	or die "socketpair: $!\n";
> +    forked(sub {
> +	close($sp);
> +	unshare(CLONE_NEWUSER|CLONE_NEWNS) or die "unshare(NEWUSER|NEWNS): $!\n";
> +	syswrite($sc, "1\n") == 2 or die "write: $!\n";
> +	shutdown($sc, 1);
> +	my $two = <$sc>;
> +	die "failed to sync with parent process\n" if $two ne "2\n";
> +	close($sc);
> +	$! = undef;
> +	($(, $)) = (0, 0); die "$!\n" if $!;
> +	($<, $>) = (0, 0); die "$!\n" if $!;
> +	$code->();
> +    }, afterfork => sub {
> +	my ($pid) = @_;
> +	close($sc);
> +	my $one = <$sp>;
> +	die "failed to sync with user process\n" if $one ne "1\n";
> +	set_id_map($pid, $id_map);
> +	syswrite($sp, "2\n") == 2 or die "write: $!\n";
> +	close($sp);
> +    });
> +}
> +
> +1;



_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [pve-devel] [RFC qemu-server v3 20/34] backup: allow adding fleecing images also for EFI and TPM
  2024-11-07 16:51 ` [pve-devel] [RFC qemu-server v3 20/34] backup: allow adding fleecing images also for EFI and TPM Fiona Ebner
@ 2024-11-12  9:26   ` Fabian Grünbichler
  0 siblings, 0 replies; 63+ messages in thread
From: Fabian Grünbichler @ 2024-11-12  9:26 UTC (permalink / raw)
  To: Proxmox VE development discussion

Reviewed-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>

but possibly needs a rebase in case the changes from patch #19 are
adapted based on my feedback ;)

On November 7, 2024 5:51 pm, Fiona Ebner wrote:
> For the external backup API, it will be necessary to add a fleecing
> image even for small disks like EFI and TPM, because there is no other
> place the old data could be copied to when a new guest write comes in.
> 
> Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
> ---
> 
> Changes in v3:
> * adapt to context changes from previous patch
> 
>  PVE/VZDump/QemuServer.pm | 14 ++++++++------
>  1 file changed, 8 insertions(+), 6 deletions(-)
> 
> diff --git a/PVE/VZDump/QemuServer.pm b/PVE/VZDump/QemuServer.pm
> index 1ebafe6d..b6dcd6cc 100644
> --- a/PVE/VZDump/QemuServer.pm
> +++ b/PVE/VZDump/QemuServer.pm
> @@ -534,7 +534,7 @@ my sub cleanup_fleecing_images {
>  }
>  
>  my sub allocate_fleecing_images {
> -    my ($self, $disks, $vmid, $fleecing_storeid, $format) = @_;
> +    my ($self, $disks, $vmid, $fleecing_storeid, $format, $all_images) = @_;
>  
>      die "internal error - no fleecing storage specified\n" if !$fleecing_storeid;
>  
> @@ -545,7 +545,8 @@ my sub allocate_fleecing_images {
>  	my $n = 0; # counter for fleecing image names
>  
>  	for my $di ($disks->@*) {
> -	    next if $di->{virtdev} =~ m/^(?:tpmstate|efidisk)\d$/; # too small to be worth it
> +	    # EFI/TPM are usually too small to be worth it, but it's required for external providers
> +	    next if !$all_images && $di->{virtdev} =~ m/^(?:tpmstate|efidisk)\d$/;
>  	    if ($di->{type} eq 'block' || $di->{type} eq 'file') {
>  		my $scfg = PVE::Storage::storage_config($self->{storecfg}, $fleecing_storeid);
>  		my $name = "vm-$vmid-fleece-$n";
> @@ -609,7 +610,7 @@ my sub attach_fleecing_images {
>  }
>  
>  my sub check_and_prepare_fleecing {
> -    my ($self, $task, $vmid, $fleecing_opts, $disks, $is_template, $qemu_support) = @_;
> +    my ($self, $task, $vmid, $fleecing_opts, $disks, $is_template, $qemu_support, $all_images) = @_;
>  
>      # Even if the VM was started specifically for fleecing, it's possible that the VM is resumed and
>      # then starts doing IO. For VMs that are not resumed the fleecing images will just stay empty,
> @@ -632,7 +633,8 @@ my sub check_and_prepare_fleecing {
>  	    $self->{storecfg}, $fleecing_opts->{storage});
>  	my $format = scalar(grep { $_ eq 'qcow2' } $valid_formats->@*) ? 'qcow2' : 'raw';
>  
> -	allocate_fleecing_images($self, $disks, $vmid, $fleecing_opts->{storage}, $format);
> +	allocate_fleecing_images(
> +	    $self, $disks, $vmid, $fleecing_opts->{storage}, $format, $all_images);
>  	attach_fleecing_images($self, $disks, $vmid, $format);
>      }
>  
> @@ -723,7 +725,7 @@ sub archive_pbs {
>  	my $is_template = PVE::QemuConfig->is_template($self->{vmlist}->{$vmid});
>  
>  	$task->{'use-fleecing'} = check_and_prepare_fleecing(
> -	    $self, $task, $vmid, $opts->{fleecing}, $task->{disks}, $is_template, $qemu_support);
> +	    $self, $task, $vmid, $opts->{fleecing}, $task->{disks}, $is_template, $qemu_support, 0);
>  
>  	my $fs_frozen = $self->qga_fs_freeze($task, $vmid);
>  
> @@ -907,7 +909,7 @@ sub archive_vma {
>  	$attach_tpmstate_drive->($self, $task, $vmid);
>  
>  	$task->{'use-fleecing'} = check_and_prepare_fleecing(
> -	    $self, $task, $vmid, $opts->{fleecing}, $task->{disks}, $is_template, $qemu_support);
> +	    $self, $task, $vmid, $opts->{fleecing}, $task->{disks}, $is_template, $qemu_support, 0);
>  
>  	my $outfh;
>  	if ($opts->{stdout}) {
> -- 
> 2.39.5
> 
> 
> 


_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel

^ permalink raw reply	[flat|nested] 63+ messages in thread

* [pve-devel] applied: [PATCH qemu-server v3 18/34] backup: cleanup: check if VM is running before issuing QMP commands
  2024-11-07 16:51 ` [pve-devel] [PATCH qemu-server v3 18/34] backup: cleanup: check if VM is running before issuing QMP commands Fiona Ebner
@ 2024-11-12  9:26   ` Fabian Grünbichler
  0 siblings, 0 replies; 63+ messages in thread
From: Fabian Grünbichler @ 2024-11-12  9:26 UTC (permalink / raw)
  To: Proxmox VE development discussion

On November 7, 2024 5:51 pm, Fiona Ebner wrote:
> When the VM is only started for backup, it will be stopped again at
> that point. While the detach helpers do not warn about errors
> currently, that might change in the future. This is also in
> preparation for other cleanup QMP helpers that are more verbose about
> failure.
> 
> Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
> ---
> 
> No changes in v3.
> 
>  PVE/VZDump/QemuServer.pm | 11 ++++++-----
>  1 file changed, 6 insertions(+), 5 deletions(-)
> 
> diff --git a/PVE/VZDump/QemuServer.pm b/PVE/VZDump/QemuServer.pm
> index b2ced154..c46e607c 100644
> --- a/PVE/VZDump/QemuServer.pm
> +++ b/PVE/VZDump/QemuServer.pm
> @@ -1118,13 +1118,14 @@ sub snapshot {
>  sub cleanup {
>      my ($self, $task, $vmid) = @_;
>  
> -    $detach_tpmstate_drive->($task, $vmid);
> -
> -    if ($task->{'use-fleecing'}) {
> -	detach_fleecing_images($task->{disks}, $vmid);
> -	cleanup_fleecing_images($self, $task->{disks});
> +    # If VM was started only for backup, it is already stopped now.
> +    if (PVE::QemuServer::Helpers::vm_running_locally($vmid)) {
> +	$detach_tpmstate_drive->($task, $vmid);
> +	detach_fleecing_images($task->{disks}, $vmid) if $task->{'use-fleecing'};
>      }
>  
> +    cleanup_fleecing_images($self, $task->{disks}) if $task->{'use-fleecing'};
> +
>      if ($self->{qmeventd_fh}) {
>  	close($self->{qmeventd_fh});
>      }
> -- 
> 2.39.5
> 
> 
> 


_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


^ permalink raw reply	[flat|nested] 63+ messages in thread

* [pve-devel] applied: [PATCH qemu-server v3 17/34] backup: move cleanup of fleecing images to cleanup method
  2024-11-07 16:51 ` [pve-devel] [PATCH qemu-server v3 17/34] backup: move cleanup of fleecing images to cleanup method Fiona Ebner
@ 2024-11-12  9:26   ` Fabian Grünbichler
  0 siblings, 0 replies; 63+ messages in thread
From: Fabian Grünbichler @ 2024-11-12  9:26 UTC (permalink / raw)
  To: Proxmox VE development discussion

On November 7, 2024 5:51 pm, Fiona Ebner wrote:
> TPM drives are already detached there and it's better to group
> these things together.
> 
> Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
> ---
> 
> No changes in v3.
> 
>  PVE/VZDump/QemuServer.pm | 25 +++++++++----------------
>  1 file changed, 9 insertions(+), 16 deletions(-)
> 
> diff --git a/PVE/VZDump/QemuServer.pm b/PVE/VZDump/QemuServer.pm
> index 012c9210..b2ced154 100644
> --- a/PVE/VZDump/QemuServer.pm
> +++ b/PVE/VZDump/QemuServer.pm
> @@ -690,7 +690,6 @@ sub archive_pbs {
>  
>      # get list early so we die on unknown drive types before doing anything
>      my $devlist = _get_task_devlist($task);
> -    my $use_fleecing;
>  
>      $self->enforce_vm_running_for_backup($vmid);
>      $self->{qmeventd_fh} = PVE::QemuServer::register_qmeventd_handle($vmid);
> @@ -721,7 +720,7 @@ sub archive_pbs {
>  
>  	my $is_template = PVE::QemuConfig->is_template($self->{vmlist}->{$vmid});
>  
> -	$use_fleecing = check_and_prepare_fleecing(
> +	$task->{'use-fleecing'} = check_and_prepare_fleecing(
>  	    $self, $vmid, $opts->{fleecing}, $task->{disks}, $is_template, $qemu_support);
>  
>  	my $fs_frozen = $self->qga_fs_freeze($task, $vmid);
> @@ -735,7 +734,7 @@ sub archive_pbs {
>  	    devlist => $devlist,
>  	    'config-file' => $conffile,
>  	};
> -	$params->{fleecing} = JSON::true if $use_fleecing;
> +	$params->{fleecing} = JSON::true if $task->{'use-fleecing'};
>  
>  	if (defined(my $ns = $scfg->{namespace})) {
>  	    $params->{'backup-ns'} = $ns;
> @@ -784,11 +783,6 @@ sub archive_pbs {
>      }
>      $self->restore_vm_power_state($vmid);
>  
> -    if ($use_fleecing) {
> -	detach_fleecing_images($task->{disks}, $vmid);
> -	cleanup_fleecing_images($self, $task->{disks});
> -    }
> -
>      die $err if $err;
>  }
>  
> @@ -891,7 +885,6 @@ sub archive_vma {
>      }
>  
>      my $devlist = _get_task_devlist($task);
> -    my $use_fleecing;
>  
>      $self->enforce_vm_running_for_backup($vmid);
>      $self->{qmeventd_fh} = PVE::QemuServer::register_qmeventd_handle($vmid);
> @@ -911,7 +904,7 @@ sub archive_vma {
>  
>  	$attach_tpmstate_drive->($self, $task, $vmid);
>  
> -	$use_fleecing = check_and_prepare_fleecing(
> +	$task->{'use-fleecing'} = check_and_prepare_fleecing(
>  	    $self, $vmid, $opts->{fleecing}, $task->{disks}, $is_template, $qemu_support);
>  
>  	my $outfh;
> @@ -942,7 +935,7 @@ sub archive_vma {
>  		devlist => $devlist
>  	    };
>  	    $params->{'firewall-file'} = $firewall if -e $firewall;
> -	    $params->{fleecing} = JSON::true if $use_fleecing;
> +	    $params->{fleecing} = JSON::true if $task->{'use-fleecing'};
>  	    add_backup_performance_options($params, $opts->{performance}, $qemu_support);
>  
>  	    $qmpclient->queue_cmd($vmid, $backup_cb, 'backup', %$params);
> @@ -984,11 +977,6 @@ sub archive_vma {
>  
>      $self->restore_vm_power_state($vmid);
>  
> -    if ($use_fleecing) {
> -	detach_fleecing_images($task->{disks}, $vmid);
> -	cleanup_fleecing_images($self, $task->{disks});
> -    }
> -
>      if ($err) {
>  	if ($cpid) {
>  	    kill(9, $cpid);
> @@ -1132,6 +1120,11 @@ sub cleanup {
>  
>      $detach_tpmstate_drive->($task, $vmid);
>  
> +    if ($task->{'use-fleecing'}) {
> +	detach_fleecing_images($task->{disks}, $vmid);
> +	cleanup_fleecing_images($self, $task->{disks});
> +    }
> +
>      if ($self->{qmeventd_fh}) {
>  	close($self->{qmeventd_fh});
>      }
> -- 
> 2.39.5
> 
> 
> 


_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


^ permalink raw reply	[flat|nested] 63+ messages in thread

* [pve-devel] applied: [PATCH qemu-server v3 22/34] restore: die early when there is no size for a device
  2024-11-07 16:51 ` [pve-devel] [PATCH qemu-server v3 22/34] restore: die early when there is no size for a device Fiona Ebner
@ 2024-11-12  9:28   ` Fabian Grünbichler
  0 siblings, 0 replies; 63+ messages in thread
From: Fabian Grünbichler @ 2024-11-12  9:28 UTC (permalink / raw)
  To: Proxmox VE development discussion

On November 7, 2024 5:51 pm, Fiona Ebner wrote:
> Makes it a clean error for buggy (external) backup providers where the
> size might not be set at all.
> 
> Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
> ---
> 
> No changes in v3.
> 
>  PVE/QemuServer.pm | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/PVE/QemuServer.pm b/PVE/QemuServer.pm
> index 49b6ca17..30e51a8c 100644
> --- a/PVE/QemuServer.pm
> +++ b/PVE/QemuServer.pm
> @@ -6813,6 +6813,7 @@ my $restore_allocate_devices = sub {
>      my $map = {};
>      foreach my $virtdev (sort keys %$virtdev_hash) {
>  	my $d = $virtdev_hash->{$virtdev};
> +	die "got no size for '$virtdev'\n" if !defined($d->{size});
>  	my $alloc_size = int(($d->{size} + 1024 - 1)/1024);
>  	my $storeid = $d->{storeid};
>  	my $scfg = PVE::Storage::storage_config($storecfg, $storeid);
> -- 
> 2.39.5
> 
> 
> 


_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [pve-devel] [PATCH qemu-server v3 19/34] backup: keep track of block-node size for fleecing
  2024-11-11 14:22   ` Fabian Grünbichler
@ 2024-11-12  9:50     ` Fiona Ebner
  0 siblings, 0 replies; 63+ messages in thread
From: Fiona Ebner @ 2024-11-12  9:50 UTC (permalink / raw)
  To: Proxmox VE development discussion, Fabian Grünbichler

On 11.11.24 3:22 PM, Fabian Grünbichler wrote:
> On November 7, 2024 5:51 pm, Fiona Ebner wrote:
>> @@ -1042,6 +1044,31 @@ sub qga_fs_thaw {
>>      $self->logerr($@) if $@;
>>  }
>>  
>> +# The size for fleecing images needs to be exactly the same size as QEMU sees, e.g. an EFI disk
>> +# can be attached with a smaller size than the underlying image on the storage.
>> +sub query_block_node_sizes {
>> +    my ($self, $vmid, $task) = @_;
>> +
>> +    my $block_info = mon_cmd($vmid, "query-block");
>> +    $block_info = { map { $_->{device} => $_ } $block_info->@* };
>> +
>> +    for my $diskinfo ($task->{disks}->@*) {
> 
> only usage of $task
> 
> so we don't actually need to add $task as parameter to the two existing
> subs, but can just modify this here to take $task->{disks} directly? or
> did I overlook something?
> 
> if we do have to keep $task as parameter, it should come before $vmid in
> the argument list, to be consistent with the rest..
> 
> other than that, consider this patch
> 

Right, since $task->{disks} is itself a reference, this should work out
fine :)


_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [pve-devel] [RFC common v3 10/34] env: add module with helpers to run a Perl subroutine in a user namespace
  2024-11-11 18:33   ` Thomas Lamprecht
@ 2024-11-12 10:19     ` Fiona Ebner
  0 siblings, 0 replies; 63+ messages in thread
From: Fiona Ebner @ 2024-11-12 10:19 UTC (permalink / raw)
  To: Thomas Lamprecht, Proxmox VE development discussion

On 11.11.24 7:33 PM, Thomas Lamprecht wrote:
> On 07.11.24 at 17:51, Fiona Ebner wrote:
>> +package PVE::Env;
> 
> can this module and its name be more specific to doing stuff with/in namespaces?
> 
> e.g. PVE::Namespaces or PVE::Sys::Namespaces (there might be other stuff that might
> fit well in a future libproxmox-sys-perl and Proxmox::Sys::* respectively, so
> maybe that module path would be better?)
> 
> I'd also make all subs private if they're not really intended to be used
> outside this module.
> 

Will do!

> If the more general fork/wait-child helpers are needed elsewhere, or deemed
> to be useful, then they could go in their own module, like e.g. PVE::Sys::Process
> 

Since you already pointed out a potential user, will go for this too.

>> +sub forked(&%) {
> 
> FWIW, there's some "forked" method in test/lock_file.pl that this might replace too,
> if it stays public.
> 

I'll check if it can be adapted easily.


_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [pve-devel] [RFC qemu-server v3 21/34] backup: implement backup for external providers
  2024-11-07 16:51 ` [pve-devel] [RFC qemu-server v3 21/34] backup: implement backup for external providers Fiona Ebner
@ 2024-11-12 12:27   ` Fabian Grünbichler
  2024-11-12 14:35     ` Fiona Ebner
  0 siblings, 1 reply; 63+ messages in thread
From: Fabian Grünbichler @ 2024-11-12 12:27 UTC (permalink / raw)
  To: Proxmox VE development discussion

some nits/comments/questions below, but the general direction/structure
already looks quite good I think!

On November 7, 2024 5:51 pm, Fiona Ebner wrote:
> The state of the VM's disk images at the time the backup is started is
> preserved via a snapshot-access block node. Old data is moved to the
> fleecing image when new guest writes come in. The snapshot-access
> block node, as well as the associated bitmap in case of incremental
> backup, will be made available to the external provider. They are
> exported via NBD and for 'nbd' mechanism, the NBD socket path is
> passed to the provider, while for 'block-device' mechanism, the NBD
> export is made accessible as a regular block device first and the
> bitmap information is made available via a $next_dirty_region->()
> function. For 'block-device', the 'nbdinfo' binary is required.
> 
> The provider can indicate that it wants to do an incremental backup by
> returning the bitmap ID that was used for a previous backup and it
> will then be told if the bitmap was newly created (either first backup
> or old bitmap was invalid) or if the bitmap can be reused.
> 
> The provider then reads the parts of the NBD or block device it needs,
> either the full disk for full backup, or the dirty parts according to
> the bitmap for incremental backup. The bitmap has to be respected;
> reads of other parts of the image will return an error. After backing
> up each part of the disk, it should be discarded in the export to
> avoid unnecessary space usage in the fleecing image (requires the
> storage underlying the fleecing image to support discard too).
> 
> Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
> ---
> 
> Changes in v3:
> * adapt to API changes, config files are now passed as raw
> 
>  PVE/VZDump/QemuServer.pm | 309 ++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 308 insertions(+), 1 deletion(-)
> 
> diff --git a/PVE/VZDump/QemuServer.pm b/PVE/VZDump/QemuServer.pm
> index b6dcd6cc..d0218c9b 100644
> --- a/PVE/VZDump/QemuServer.pm
> +++ b/PVE/VZDump/QemuServer.pm
> @@ -20,7 +20,7 @@ use PVE::QMPClient;
>  use PVE::Storage::Plugin;
>  use PVE::Storage::PBSPlugin;
>  use PVE::Storage;
> -use PVE::Tools;
> +use PVE::Tools qw(run_command);
>  use PVE::VZDump;
>  use PVE::Format qw(render_duration render_bytes);
>  
> @@ -277,6 +277,8 @@ sub archive {
>  
>      if ($self->{vzdump}->{opts}->{pbs}) {
>  	$self->archive_pbs($task, $vmid);
> +    } elsif ($self->{vzdump}->{'backup-provider'}) {
> +	$self->archive_external($task, $vmid);
>      } else {
>  	$self->archive_vma($task, $vmid, $filename, $comp);
>      }
> @@ -1149,6 +1151,23 @@ sub cleanup {
>  
>      # If VM was started only for backup, it is already stopped now.
>      if (PVE::QemuServer::Helpers::vm_running_locally($vmid)) {
> +	if ($task->{cleanup}->{'nbd-stop'}) {
> +	    eval { PVE::QemuServer::QMPHelpers::nbd_stop($vmid); };
> +	    $self->logerr($@) if $@;
> +	}
> +
> +	if (my $info = $task->{cleanup}->{'backup-access-teardown'}) {
> +	    my $params = {
> +		'target-id' => $info->{'target-id'},
> +		timeout => 60,
> +		success => $info->{success} ? JSON::true : JSON::false,
> +	    };
> +
> +	    $self->loginfo("tearing down backup-access");
> +	    eval { mon_cmd($vmid, "backup-access-teardown", $params->%*) };
> +	    $self->logerr($@) if $@;
> +	}
> +
>  	$detach_tpmstate_drive->($task, $vmid);
>  	detach_fleecing_images($task->{disks}, $vmid) if $task->{'use-fleecing'};
>      }
> @@ -1160,4 +1179,292 @@ sub cleanup {
>      }
>  }
>  
> +my sub block_device_backup_cleanup {
> +    my ($self, $paths, $cpids) = @_;
> +
> +    for my $path ($paths->@*) {
> +	eval { run_command(["qemu-nbd", "-d", $path ]); };
> +	$self->log('warn', "unable to disconnect NBD backup source '$path' - $@") if $@;
> +    }
> +
> +    my $waited;
> +    my $wait_limit = 5;
> +    for ($waited = 0; $waited < $wait_limit && scalar(keys $cpids->%*); $waited++) {
> +	while ((my $cpid = waitpid(-1, POSIX::WNOHANG)) > 0) {
> +	    delete($cpids->{$cpid});
> +	}
> +	if ($waited == 0) {
> +	    kill 15, $_ for keys $cpids->%*;
> +	}
> +	sleep 1;
> +    }
> +    if ($waited == $wait_limit && scalar(keys $cpids->%*)) {
> +	kill 9, $_ for keys $cpids->%*;
> +	sleep 1;
> +	while ((my $cpid = waitpid(-1, POSIX::WNOHANG)) > 0) {

this could be a bit dangerous. since we have an explicit list of cpids
we want to wait for, couldn't we just waitpid explicitly for them?

just wary of potential side-effects on things like hookscripts or future
features that also require forking ;)
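
i.e. something like (untested):

    for my $cpid (keys $cpids->%*) {
	# only reap the children we spawned ourselves
	delete($cpids->{$cpid}) if waitpid($cpid, POSIX::WNOHANG) == $cpid;
    }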

> +	    delete($cpids->{$cpid});
> +	}
> +	$self->log('warn', "unable to collect nbdinfo child process '$_'") for keys $cpids->%*;
> +    }
> +}
> +
> +my sub block_device_backup_prepare {
> +    my ($self, $devicename, $size, $nbd_path, $bitmap_name, $count) = @_;

nit: $device_name for consistency's sake?

> +
> +    my $nbd_info_uri = "nbd+unix:///${devicename}?socket=${nbd_path}";
> +    my $qemu_nbd_uri = "nbd:unix:${nbd_path}:exportname=${devicename}";
> +
> +    my $cpid;
> +    my $error_fh;
> +    my $next_dirty_region;
> +
> +    # If there is no dirty bitmap, it can be treated as if there's a full dirty one. The output of
> +    # nbdinfo is a list of tuples with offset, length, type, description. The first bit of 'type' is
> +    # set when the bitmap is dirty, see QEMU's docs/interop/nbd.txt
> +    my $dirty_bitmap = [];
> +    if ($bitmap_name) {
> +	my $input = IO::File->new();
> +	my $info = IO::File->new();
> +	$error_fh = IO::File->new();
> +	my $nbdinfo_cmd = ["nbdinfo", $nbd_info_uri, "--map=qemu:dirty-bitmap:${bitmap_name}"];
> +	$cpid = open3($input, $info, $error_fh, $nbdinfo_cmd->@*)
> +	    or die "failed to spawn nbdinfo child - $!\n";
> +
> +	$next_dirty_region = sub {
> +	    my ($offset, $length, $type);
> +	    do {
> +		my $line = <$info>;
> +		return if !$line;
> +		die "unexpected output from nbdinfo - $line\n"
> +		    if $line !~ m/^\s*(\d+)\s*(\d+)\s*(\d+)/; # also untaints
> +		($offset, $length, $type) = ($1, $2, $3);
> +	    } while (($type & 0x1) == 0); # not dirty
> +	    return ($offset, $length);
> +	};
> +    } else {
> +	my $done = 0;
> +	$next_dirty_region = sub {
> +	    return if $done;
> +	    $done = 1;
> +	    return (0, $size);
> +	};
> +    }
> +
> +    my $blockdev = "/dev/nbd${count}";

what if that is already used/taken by somebody? I think we'd need logic
to find a free slot here..
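
rough sketch (the kernel exposes a 'pid' attribute in sysfs while an nbd
device is connected, so that could serve as a best-effort check):

    my $blockdev;
    for my $i (0 .. 127) {
	if (!-e "/sys/block/nbd${i}/pid") { # no client connected yet
	    $blockdev = "/dev/nbd${i}";
	    last;
	}
    }
    die "unable to find a free /dev/nbdX device\n" if !$blockdev;

still racy of course, so a qemu-nbd failure would need handling anyway..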

> +
> +    eval {
> +	run_command(["qemu-nbd", "-c", $blockdev, $qemu_nbd_uri, "--format=raw", "--discard=on"]);
> +    };
> +    if (my $err = $@) {
> +	my $cpids = {};
> +	$cpids->{$cpid} = 1 if $cpid;
> +	block_device_backup_cleanup($self, [$blockdev], $cpids);
> +	die $err;
> +    }
> +
> +    return ($blockdev, $next_dirty_region, $cpid);
> +}
> +
> +my sub backup_access_to_volume_info {
> +    my ($self, $backup_access_info, $mechanism, $nbd_path) = @_;
> +
> +    my $child_pids = {}; # used for nbdinfo calls
> +    my $count = 0; # counter for block devices, i.e. /dev/nbd${count}
> +    my $volumes = {};
> +
> +    for my $info ($backup_access_info->@*) {
> +	my $bitmap_status = 'none';
> +	my $bitmap_name;
> +	if (my $bitmap_action = $info->{'bitmap-action'}) {
> +	    my $bitmap_action_to_status = {
> +		'not-used' => 'none',
> +		'not-used-removed' => 'none',
> +		'new' => 'new',
> +		'used' => 'reuse',
> +		'invalid' => 'new',
> +	    };

nit: should we move this outside of the loop? it's a static map after
all.. (or maybe the perl interpreter is smart enough anyway ;))
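
i.e. (sketch):

    # static mapping, so define it once at file scope:
    my $bitmap_action_to_status = {
	'not-used' => 'none',
	'not-used-removed' => 'none',
	'new' => 'new',
	'used' => 'reuse',
	'invalid' => 'new',
    };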

> +
> +	    $bitmap_status = $bitmap_action_to_status->{$bitmap_action}
> +		or die "got unexpected bitmap action '$bitmap_action'\n";
> +
> +	    $bitmap_name = $info->{'bitmap-name'} or die "bitmap-name is not present\n";
> +	}
> +
> +	my ($device, $size) = $info->@{qw(device size)};
> +
> +	$volumes->{$device}->{'bitmap-mode'} = $bitmap_status;
> +	$volumes->{$device}->{size} = $size;
> +
> +	if ($mechanism eq 'block-device') {
> +	    my ($blockdev, $next_dirty_region, $child_pid) = block_device_backup_prepare(
> +		$self, $device, $size, $nbd_path, $bitmap_name, $count);
> +	    $count++;
> +	    $child_pids->{$child_pid} = 1 if $child_pid;
> +	    $volumes->{$device}->{path} = $blockdev;
> +	    $volumes->{$device}->{'next-dirty-region'} = $next_dirty_region;
> +	} elsif ($mechanism eq 'nbd') {
> +	    $volumes->{$device}->{'nbd-path'} = $nbd_path;
> +	    $volumes->{$device}->{'bitmap-name'} = $bitmap_name;
> +	} else {
> +	    die "internal error - unknown mechanism '$mechanism'";
> +	}
> +    }
> +
> +    return ($volumes, $child_pids);
> +}
> +
> +sub archive_external {
> +    my ($self, $task, $vmid) = @_;
> +
> +    my $guest_config = PVE::Tools::file_get_contents("$task->{tmpdir}/qemu-server.conf");
> +    my $firewall_file = "$task->{tmpdir}/qemu-server.fw";
> +
> +    my $opts = $self->{vzdump}->{opts};
> +
> +    my $backup_provider = $self->{vzdump}->{'backup-provider'};
> +
> +    $self->loginfo("starting external backup via " . $backup_provider->provider_name());
> +
> +    my $starttime = time();
> +
> +    # get list early so we die on unknown drive types before doing anything
> +    my $devlist = _get_task_devlist($task);
> +
> +    $self->enforce_vm_running_for_backup($vmid);
> +    $self->{qmeventd_fh} = PVE::QemuServer::register_qmeventd_handle($vmid);
> +
> +    eval {
> +	$SIG{INT} = $SIG{TERM} = $SIG{QUIT} = $SIG{HUP} = $SIG{PIPE} = sub {
> +	    die "interrupted by signal\n";
> +	};
> +
> +	my $qemu_support = mon_cmd($vmid, "query-proxmox-support");
> +
> +	$attach_tpmstate_drive->($self, $task, $vmid);
> +
> +	my $is_template = PVE::QemuConfig->is_template($self->{vmlist}->{$vmid});
> +
> +	my $fleecing = check_and_prepare_fleecing(
> +	    $self, $task, $vmid, $opts->{fleecing}, $task->{disks}, $is_template, $qemu_support, 1);
> +	die "cannot setup backup access without fleecing\n" if !$fleecing;
> +
> +	$task->{'use-fleecing'} = 1;
> +
> +	my $fs_frozen = $self->qga_fs_freeze($task, $vmid);

should we move this (A)

> +
> +	my $target_id = $opts->{storage};
> +
> +	my $params = {
> +	    'target-id' => $target_id,
> +	    devlist => $devlist,
> +	    timeout => 60,
> +	};

and this (B)

> +
> +	my ($mechanism, $bitmap_name) = $backup_provider->backup_get_mechanism($vmid, 'qemu');
> +	die "mechanism '$mechanism' requested by backup provider is not supported for VMs\n"
> +	    if $mechanism ne 'block-device' && $mechanism ne 'nbd';
> +
> +	if ($mechanism eq 'block-device') {
> +	    # For mechanism 'block-device' the bitmap needs to be passed to the provider. The bitmap
> +	    # cannot be dumped via QMP and doing it via qemu-img is experimental, so use nbdinfo.
> +	    die "need 'nbdinfo' binary from package libnbd-bin\n" if !-e "/usr/bin/nbdinfo";
> +
> +	    # NOTE nbds_max won't change if module is already loaded
> +	    run_command(["modprobe", "nbd", "nbds_max=128"]);

should this maybe be put into a modprobe snippet somewhere, and we just
verify here that nbd is available? not that we can currently reach 128
guest disks ;)
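
e.g. shipping something like this (file name just an example):

    # /etc/modprobe.d/pve-nbd.conf
    options nbd nbds_max=128

and here only checking that the module is loaded (e.g. -d /sys/module/nbd)
before falling back to a plain modprobe.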

> +	}

down here (B)

> +
> +	if ($bitmap_name) {
> +	    # prepend storage ID so different providers can never cause clashes
> +	    $bitmap_name = "$opts->{storage}-" . $bitmap_name;
> +	    $params->{'bitmap-name'} = $bitmap_name;

not related to this patch directly - if we do this for external
providers, do we also want to do it for different PBS targets maybe? :)

> +	}
> +
> +	$self->loginfo("setting up snapshot-access for backup");
> +

and down here (A)?

> +	my $backup_access_info = eval { mon_cmd($vmid, "backup-access-setup", $params->%*) };
> +	my $qmperr = $@;
> +
> +	$task->{cleanup}->{'backup-access-teardown'} = { 'target-id' => $target_id, success => 0 };

should we differentiate here between setup success or failure? if not,
should we move it directly before the setup call?

> +
> +	if ($fs_frozen) {
> +	    $self->qga_fs_thaw($vmid);
> +	}
> +
> +	die $qmperr if $qmperr;
> +
> +	$self->resume_vm_after_job_start($task, $vmid);
> +
> +	my $bitmap_info = mon_cmd($vmid, 'query-pbs-bitmap-info');
> +	for my $info (sort { $a->{drive} cmp $b->{drive} } $bitmap_info->@*) {
> +	    my $text = $bitmap_action_to_human->($self, $info);
> +	    my $drive = $info->{drive};
> +	    $drive =~ s/^drive-//; # for consistency
> +	    $self->loginfo("$drive: dirty-bitmap status: $text");
> +	}
> +
> +	$self->loginfo("starting NBD server");
> +
> +	my $nbd_path = "/run/qemu-server/$vmid\_nbd.backup_access";
> +	mon_cmd(
> +	    $vmid, "nbd-server-start", addr => { type => 'unix', data => { path => $nbd_path } } );
> +	$task->{cleanup}->{'nbd-stop'} = 1;
> +
> +	for my $info ($backup_access_info->@*) {
> +	    $self->loginfo("adding NBD export for $info->{device}");
> +
> +	    my $export_params = {
> +		id => $info->{device},
> +		'node-name' => $info->{'node-name'},
> +		writable => JSON::true, # for discard
> +		type => "nbd",
> +		name => $info->{device}, # NBD export name
> +	    };
> +
> +	    if ($info->{'bitmap-name'}) {
> +		$export_params->{bitmaps} = [{
> +		    node => $info->{'bitmap-node-name'},
> +		    name => $info->{'bitmap-name'},
> +		}],
> +	    }
> +
> +	    mon_cmd($vmid, "block-export-add", $export_params->%*);
> +	}
> +
> +	my $child_pids = {}; # used for nbdinfo calls
> +	my $volumes = {};
> +
> +	eval {
> +	    ($volumes, $child_pids) =
> +		backup_access_to_volume_info($self, $backup_access_info, $mechanism, $nbd_path);

so this here forks child processes (via block_device_backup_prepare),
but it might fail halfway through after having forked X/N children, and
then we don't have any information about the forked processes here (C)

> +
> +	    my $param = {};
> +	    $param->{'bandwidth-limit'} = $opts->{bwlimit} * 1024 if $opts->{bwlimit};
> +	    $param->{'firewall-config'} = PVE::Tools::file_get_contents($firewall_file)
> +		if -e $firewall_file;
> +
> +	    $backup_provider->backup_vm($vmid, $guest_config, $volumes, $param);
> +	};
> +	my $err = $@;
> +
> +	if ($mechanism eq 'block-device') {
> +	    my $cleanup_paths = [map { $volumes->{$_}->{path} } keys $volumes->%*];
> +	    block_device_backup_cleanup($self, $cleanup_paths, $child_pids)

C: to do this cleanup here.. should we maybe record both cpids and
volumes as part of $self->{cleanup}, instead of returning them, so that
we can handle that case as well?
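
e.g. (sketch) recording each child and block device as soon as it exists:

    $self->{cleanup}->{cpids}->{$child_pid} = 1 if $child_pid;
    push $self->{cleanup}->{'nbd-block-devices'}->@*, $blockdev;

so that cleanup() can call block_device_backup_cleanup() based on those
entries even if backup_access_to_volume_info() dies halfway through.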

> +	}
> +
> +	die $err if $err;
> +    };
> +    my $err = $@;
> +
> +    if ($err) {
> +	$self->logerr($err);
> +	$self->resume_vm_after_job_start($task, $vmid);
> +    } else {
> +	$task->{size} = $backup_provider->backup_get_task_size($vmid);
> +	$task->{cleanup}->{'backup-access-teardown'}->{success} = 1;
> +    }
> +    $self->restore_vm_power_state($vmid);
> +
> +    die $err if $err;
> +}
> +
>  1;
> -- 
> 2.39.5
> 
> 
> 


_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [pve-devel] [RFC common v3 10/34] env: add module with helpers to run a Perl subroutine in a user namespace
  2024-11-07 16:51 ` [pve-devel] [RFC common v3 10/34] env: add module with helpers to run a Perl subroutine in a user namespace Fiona Ebner
  2024-11-11 18:33   ` Thomas Lamprecht
@ 2024-11-12 14:20   ` Fabian Grünbichler
  2024-11-13 10:08     ` Fiona Ebner
  1 sibling, 1 reply; 63+ messages in thread
From: Fabian Grünbichler @ 2024-11-12 14:20 UTC (permalink / raw)
  To: Proxmox VE development discussion

On November 7, 2024 5:51 pm, Fiona Ebner wrote:
> The first use case is running the container backup subroutine for
> external providers inside a user namespace. That allows them to see
> the filesystem to back up from the container's perspective and also
> improves security because of isolation.
> 
> Copied and adapted the relevant parts from the pve-buildpkg
> repository.
> 
> Originally-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
> [FE: add $idmap parameter, drop $aux_groups parameter]
> Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
> ---
> 
> New in v3.
> 
>  src/Makefile   |   1 +
>  src/PVE/Env.pm | 136 +++++++++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 137 insertions(+)
>  create mode 100644 src/PVE/Env.pm
> 
> diff --git a/src/Makefile b/src/Makefile
> index 2d8bdc4..dba26e3 100644
> --- a/src/Makefile
> +++ b/src/Makefile
> @@ -15,6 +15,7 @@ LIB_SOURCES = \
>  	Certificate.pm \
>  	CpuSet.pm \
>  	Daemon.pm \
> +	Env.pm \
>  	Exception.pm \
>  	Format.pm \
>  	INotify.pm \
> diff --git a/src/PVE/Env.pm b/src/PVE/Env.pm
> new file mode 100644
> index 0000000..e11bec0
> --- /dev/null
> +++ b/src/PVE/Env.pm
> @@ -0,0 +1,136 @@
> +package PVE::Env;

I agree with Thomas that this name might be a bit too generic ;)

I also wonder - since this seems to be only used in pve-container, and
it really mostly makes sense in that context, wouldn't it be better off
there? or do we expect other areas where we need userns handling?
(granted, some of the comments below would require other changes to
pve-common anyway ;))

> +
> +use strict;
> +use warnings;
> +
> +use Fcntl qw(O_WRONLY);
> +use POSIX qw(EINTR);
> +use Socket;
> +
> +require qw(syscall.ph);

PVE::Syscall already does this, and has the following:

BEGIN {
    die "syscall.ph can only be required once!\n" if $INC{'syscall.ph'};
    require("syscall.ph");

don't those two clash? I think those syscall related parts should
probably move there?

> +
> +use constant {CLONE_NEWNS   => 0x00020000,
> +              CLONE_NEWUSER => 0x10000000};
> +
> +sub unshare($) {
> +    my ($flags) = @_;
> +    return 0 == syscall(272, $flags);
> +}

this is PVE::Tools::unshare, maybe the latter should move here?
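
fwiw, IIRC the existing one is just a thin wrapper around PVE::Syscall:

    sub unshare($) {
	my ($flags) = @_;
	return 0 == syscall(PVE::Syscall::unshare, $flags);
    }

so moving it over would also get rid of the hard-coded syscall number.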

> +
> +sub __set_id_map($$$) {
> +    my ($pid, $what, $value) = @_;
> +    sysopen(my $fd, "/proc/$pid/${what}_map", O_WRONLY)
> +	or die "failed to open child process' ${what}_map\n";
> +    my $rc = syswrite($fd, $value);
> +    if (!$rc || $rc != length($value)) {
> +	die "failed to set sub$what: $!\n";
> +    }
> +    close($fd);
> +}
> +
> +sub set_id_map($$) {
> +    my ($pid, $id_map) = @_;
> +
> +    my $gid_map = '';
> +    my $uid_map = '';
> +
> +    for my $map ($id_map->@*) {
> +	my ($type, $ct, $host, $length) = $map->@*;
> +
> +	$gid_map .= "$ct $host $length\n" if $type eq 'g';
> +	$uid_map .= "$ct $host $length\n" if $type eq 'u';
> +    }
> +
> +    __set_id_map($pid, 'gid', $gid_map) if $gid_map;
> +    __set_id_map($pid, 'uid', $uid_map) if $uid_map;
> +}

do we gain a lot here from not just using newuidmap/newgidmap?
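
e.g. for a single mapping entry (untested; the helpers come from the
'uidmap' package):

    run_command(['newuidmap', $pid, $ct_uid, $host_uid, $length]);
    run_command(['newgidmap', $pid, $ct_gid, $host_gid, $length]);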

> +
> +sub wait_for_child($;$) {
> +    my ($pid, $noerr) = @_;
> +    my $interrupts = 0;
> +    while (waitpid($pid, 0) != $pid) {
> +	if ($! == EINTR) {
> +	    warn "interrupted...\n";
> +	    kill(($interrupts > 3 ? 9 : 15), $pid);
> +	    $interrupts++;
> +	}
> +    }
> +    my $status = POSIX::WEXITSTATUS($?);
> +    return $status if $noerr;
> +
> +    if ($? == -1) {
> +	die "failed to execute\n";
> +    } elsif (POSIX::WIFSIGNALED($?)) {
> +	my $sig = POSIX::WTERMSIG($?);
> +	die "got signal $sig\n";
> +    } elsif ($status != 0) {
> +	warn "exit code $status\n";
> +    }
> +    return $status;
> +}
> +
> +sub forked(&%) {

this seems very similar to the already existing PVE::Tools::run_fork /
run_fork_with_timeout helpers.. any reason we can't extend those with
`afterfork` support and use them?

> +    my ($code, %opts) = @_;
> +
> +    pipe(my $except_r, my $except_w) or die "pipe: $!\n";
> +
> +    my $pid = fork();
> +    die "fork failed: $!\n" if !defined($pid);
> +
> +    if ($pid == 0) {
> +	close($except_r);
> +	eval { $code->() };
> +	if ($@) {
> +	    print {$except_w} $@;
> +	    $except_w->flush();
> +	    POSIX::_exit(1);
> +	}
> +	POSIX::_exit(0);
> +    }
> +    close($except_w);
> +
> +    my $err;
> +    if (my $afterfork = $opts{afterfork}) {
> +	eval { $afterfork->($pid); };
> +	if ($err = $@) {
> +	    kill(15, $pid);
> +	    $opts{noerr} = 1;
> +	}
> +    }
> +    if (!$err) {
> +	$err = do { local $/ = undef; <$except_r> };
> +    }
> +    my $rv = wait_for_child($pid, $opts{noerr});
> +    die $err if $err;
> +    die "an unknown error occurred\n" if $rv != 0;
> +    return $rv;
> +}
> +
> +sub run_in_userns(&;$) {
> +    my ($code, $id_map) = @_;
> +    socketpair(my $sp, my $sc, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
> +	or die "socketpair: $!\n";
> +    forked(sub {
> +	close($sp);
> +	unshare(CLONE_NEWUSER|CLONE_NEWNS) or die "unshare(NEWUSER|NEWNS): $!\n";

I guess we can't set our "own" maps here for lack of capabilities and
avoid the whole afterfork thing entirely? at least I couldn't get it to
work ;)

> +	syswrite($sc, "1\n") == 2 or die "write: $!\n";
> +	shutdown($sc, 1);
> +	my $two = <$sc>;
> +	die "failed to sync with parent process\n" if $two ne "2\n";
> +	close($sc);
> +	$! = undef;
> +	($(, $)) = (0, 0); die "$!\n" if $!;
> +	($<, $>) = (0, 0); die "$!\n" if $!;
> +	$code->();
> +    }, afterfork => sub {
> +	my ($pid) = @_;
> +	close($sc);
> +	my $one = <$sp>;
> +	die "failed to sync with user process\n" if $one ne "1\n";
> +	set_id_map($pid, $id_map);
> +	syswrite($sp, "2\n") == 2 or die "write: $!\n";
> +	close($sp);
> +    });
> +}
> +
> +1;
> -- 
> 2.39.5
> 
> 
> 


_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [pve-devel] [RFC qemu-server v3 21/34] backup: implement backup for external providers
  2024-11-12 12:27   ` Fabian Grünbichler
@ 2024-11-12 14:35     ` Fiona Ebner
  2024-11-12 15:17       ` Fabian Grünbichler
  0 siblings, 1 reply; 63+ messages in thread
From: Fiona Ebner @ 2024-11-12 14:35 UTC (permalink / raw)
  To: Proxmox VE development discussion, Fabian Grünbichler

On 12.11.24 1:27 PM, Fabian Grünbichler wrote:
> On November 7, 2024 5:51 pm, Fiona Ebner wrote:
>> +    if ($waited == $wait_limit && scalar(keys $cpids->%*)) {
>> +	kill 9, $_ for keys $cpids->%*;
>> +	sleep 1;
>> +	while ((my $cpid = waitpid(-1, POSIX::WNOHANG)) > 0) {
> 
> this could be a bit dangerous, since we have an explicit list of cpids
> we want to wait for, couldn't we just waitpid explicitly for them?
> 
> just wary of potential side-effects on things like hookscripts or future
> features that also require forking ;)
> 

Will do!

>> +	    delete($cpids->{$cpid});
>> +	}
>> +	$self->log('warn', "unable to collect nbdinfo child process '$_'") for keys $cpids->%*;
>> +    }
>> +}
>> +
>> +my sub block_device_backup_prepare {
>> +    my ($self, $devicename, $size, $nbd_path, $bitmap_name, $count) = @_;
> 
> nit: $device_name for consistency's sake?
> 

Will rename, as well as in the API and backup provider plugins.

---snip---

>> +
>> +    my $blockdev = "/dev/nbd${count}";
> 
> what if that is already used/taken by somebody? I think we'd need logic
> to find a free slot here..
> 

Then the command will fail. I haven't found an obvious way to see
whether it is in use yet, except interpreting a failure to mean that
(which it doesn't necessarily). We could go for that, but I'll take
another look to see if there is a better way.

>> +
>> +    eval {
>> +	run_command(["qemu-nbd", "-c", $blockdev, $qemu_nbd_uri, "--format=raw", "--discard=on"]);
>> +    };

---snip---

>> +    for my $info ($backup_access_info->@*) {
>> +	my $bitmap_status = 'none';
>> +	my $bitmap_name;
>> +	if (my $bitmap_action = $info->{'bitmap-action'}) {
>> +	    my $bitmap_action_to_status = {
>> +		'not-used' => 'none',
>> +		'not-used-removed' => 'none',
>> +		'new' => 'new',
>> +		'used' => 'reuse',
>> +		'invalid' => 'new',
>> +	    };
> 
> nit: should we move this outside of the loop? it's a static map after
> all.. (or maybe the perl interpreter is smart enough anyway ;))
> 

Ack.

---snip---

>> +
>> +	my $fs_frozen = $self->qga_fs_freeze($task, $vmid);
> 
> should we move this (A)
> 
>> +
>> +	my $target_id = $opts->{storage};
>> +
>> +	my $params = {
>> +	    'target-id' => $target_id,
>> +	    devlist => $devlist,
>> +	    timeout => 60,
>> +	};
> 
> and this (B)
> 
>> +
>> +	my ($mechanism, $bitmap_name) = $backup_provider->backup_get_mechanism($vmid, 'qemu');
>> +	die "mechanism '$mechanism' requested by backup provider is not supported for VMs\n"
>> +	    if $mechanism ne 'block-device' && $mechanism ne 'nbd';
>> +
>> +	if ($mechanism eq 'block-device') {
>> +	    # For mechanism 'block-device' the bitmap needs to be passed to the provider. The bitmap
>> +	    # cannot be dumped via QMP and doing it via qemu-img is experimental, so use nbdinfo.
>> +	    die "need 'nbdinfo' binary from package libnbd-bin\n" if !-e "/usr/bin/nbdinfo";
>> +
>> +	    # NOTE nbds_max won't change if module is already loaded
>> +	    run_command(["modprobe", "nbd", "nbds_max=128"]);
> 
> should this maybe be put into a modprobe snippet somewhere, and we just
> verify here that nbd is available? not that we can currently reach 128
> guest disks ;)
> 

Will look into it!

>> +	}
> 
> down here (B)
> 

Ack.

>> +
>> +	if ($bitmap_name) {
>> +	    # prepend storage ID so different providers can never cause clashes
>> +	    $bitmap_name = "$opts->{storage}-" . $bitmap_name;
>> +	    $params->{'bitmap-name'} = $bitmap_name;
> 
> not related to this patch directly - if we do this for external
> providers, do we also want to do it for different PBS targets maybe? :)
> 

Yes, I thought about that too. Will need a QEMU patch to support it with
the 'backup' QMP command, but don't see any real blocker :)

>> +	}
>> +
>> +	$self->loginfo("setting up snapshot-access for backup");
>> +
> 
> and down here (A)?

Agreed, but I'll move it before the log line ;)

>> +	my $backup_access_info = eval { mon_cmd($vmid, "backup-access-setup", $params->%*) };
>> +	my $qmperr = $@;
>> +
>> +	$task->{cleanup}->{'backup-access-teardown'} = { 'target-id' => $target_id, success => 0 };
> 
> should we differentiate here between setup success or failure? if not,
> should we move it directly before the setup call?
> 

No, there should be no differentiation. The teardown QMP command needs
to be called in any case. But how could it happen that we do reach
cleanup but haven't gotten through here after the setup call? The setup
call is in an eval and there is nothing that can die in between. I can
still move it if you feel that is cleaner.

--snip---

>> +	my $child_pids = {}; # used for nbdinfo calls
>> +	my $volumes = {};
>> +
>> +	eval {
>> +	    ($volumes, $child_pids) =
>> +		backup_access_to_volume_info($self, $backup_access_info, $mechanism, $nbd_path);
> 
> so this here forks child processes (via block_device_backup_prepare),
> but it might fail halfway through after having forked X/N children, then
> we don't have any information about the forked processes here (C)
> 
>> +
>> +	    my $param = {};
>> +	    $param->{'bandwidth-limit'} = $opts->{bwlimit} * 1024 if $opts->{bwlimit};
>> +	    $param->{'firewall-config'} = PVE::Tools::file_get_contents($firewall_file)
>> +		if -e $firewall_file;
>> +
>> +	    $backup_provider->backup_vm($vmid, $guest_config, $volumes, $param);
>> +	};
>> +	my $err = $@;
>> +
>> +	if ($mechanism eq 'block-device') {
>> +	    my $cleanup_paths = [map { $volumes->{$_}->{path} } keys $volumes->%*];
>> +	    block_device_backup_cleanup($self, $cleanup_paths, $child_pids)
> 
> C: to do this cleanup here.. should we maybe record both cpids and
> volumes as part of $self->{cleanup}, instead of returning them, so that
> we can handle that case as well?
> 

Good catch!


_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [pve-devel] [RFC qemu-server v3 21/34] backup: implement backup for external providers
  2024-11-12 14:35     ` Fiona Ebner
@ 2024-11-12 15:17       ` Fabian Grünbichler
  0 siblings, 0 replies; 63+ messages in thread
From: Fabian Grünbichler @ 2024-11-12 15:17 UTC (permalink / raw)
  To: Fiona Ebner, Proxmox VE development discussion

On November 12, 2024 3:35 pm, Fiona Ebner wrote:
> On 12.11.24 1:27 PM, Fabian Grünbichler wrote:
>> On November 7, 2024 5:51 pm, Fiona Ebner wrote:
>>> +	my $backup_access_info = eval { mon_cmd($vmid, "backup-access-setup", $params->%*) };
>>> +	my $qmperr = $@;
>>> +
>>> +	$task->{cleanup}->{'backup-access-teardown'} = { 'target-id' => $target_id, success => 0 };
>> 
>> should we differentiate here between setup success or failure? if not,
>> should we move it directly before the setup call?
>> 
> 
> No, there should be no differentiation. The teardown QMP command needs
> to be called in any case. But how could it happen that we do reach
> cleanup but haven't gotten through here after the setup call? The setup
> call is in an eval and there is nothing that can die in between. I can
> still move it if you feel that is cleaner.

yeah, this is mostly about other stuff being added in-between later on.
probably not too likely in this case, but I always prefer setting a
cleanup-potentially-required flag *before* doing the thing that
potentially requires cleanup, rather than after (even if the current
code does everything right). it's the safer variant (even though it of
course also has potential for stuff being added in-between, and then
triggering cleanup without the cleanup-source actually having happened
;))
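
i.e. just:

    $task->{cleanup}->{'backup-access-teardown'} = { 'target-id' => $target_id, success => 0 };
    my $backup_access_info = eval { mon_cmd($vmid, "backup-access-setup", $params->%*) };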


_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel

^ permalink raw reply	[flat|nested] 63+ messages in thread

* [pve-devel] applied: [PATCH container v3 25/34] create: add missing include of PVE::Storage::Plugin
  2024-11-07 16:51 ` [pve-devel] [PATCH container v3 25/34] create: add missing include of PVE::Storage::Plugin Fiona Ebner
@ 2024-11-12 15:22   ` Fabian Grünbichler
  0 siblings, 0 replies; 63+ messages in thread
From: Fabian Grünbichler @ 2024-11-12 15:22 UTC (permalink / raw)
  To: Proxmox VE development discussion

thanks!

On November 7, 2024 5:51 pm, Fiona Ebner wrote:
> used for the shared 'COMMON_TAR_FLAGS' variable.
> 
> Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
> ---
> 
> New in v3.
> 
>  src/PVE/LXC/Create.pm | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/src/PVE/LXC/Create.pm b/src/PVE/LXC/Create.pm
> index 117103c..7c5bf0a 100644
> --- a/src/PVE/LXC/Create.pm
> +++ b/src/PVE/LXC/Create.pm
> @@ -8,6 +8,7 @@ use Fcntl;
>  
>  use PVE::RPCEnvironment;
>  use PVE::Storage::PBSPlugin;
> +use PVE::Storage::Plugin;
>  use PVE::Storage;
>  use PVE::DataCenterConfig;
>  use PVE::LXC;
> -- 
> 2.39.5
> 
> 
> 


_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


^ permalink raw reply	[flat|nested] 63+ messages in thread

* [pve-devel] partially-applied: [RFC qemu/common/storage/qemu-server/container/manager v3 00/34] backup provider API
  2024-11-07 16:51 [pve-devel] [RFC qemu/common/storage/qemu-server/container/manager v3 00/34] backup provider API Fiona Ebner
                   ` (33 preceding siblings ...)
  2024-11-07 16:51 ` [pve-devel] [RFC manager v3 34/34] backup: implement backup for external providers Fiona Ebner
@ 2024-11-12 15:50 ` Thomas Lamprecht
  34 siblings, 0 replies; 63+ messages in thread
From: Thomas Lamprecht @ 2024-11-12 15:50 UTC (permalink / raw)
  To: Proxmox VE development discussion, Fiona Ebner

Am 07.11.24 um 17:51 schrieb Fiona Ebner:
> Fiona Ebner (9):
>   block/reqlist: allow adding overlapping requests
>   PVE backup: fixup error handling for fleecing
>   PVE backup: factor out setting up snapshot access for fleecing
>   PVE backup: save device name in device info structure
>   PVE backup: include device name in error when setting up snapshot
>     access fails


Applied above QEMU patches already.


_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [pve-devel] [RFC container v3 28/34] backup: implement restore for external providers
  2024-11-07 16:51 ` [pve-devel] [RFC container v3 28/34] backup: implement restore for external providers Fiona Ebner
@ 2024-11-12 16:27   ` Fabian Grünbichler
  0 siblings, 0 replies; 63+ messages in thread
From: Fabian Grünbichler @ 2024-11-12 16:27 UTC (permalink / raw)
  To: Proxmox VE development discussion

On November 7, 2024 5:51 pm, Fiona Ebner wrote:
> First, the provider is asked about what restore mechanism to use.
> Currently, 'directory' and 'tar' are possible, for restoring either
> from a directory containing the full filesystem structure (for which
> rsync is used) or a potentially compressed tar file containing the
> same.

nit: this is outdated, directory uses tar as transport/restore mechanism
as well now :)

> 
> The new functions are copied and adapted from the existing ones for
> PBS or tar and it might be worth to factor out the common parts.
> 
> Restore of containers as privileged is prohibited, because the
> archives from an external provider are considered less trusted than
> from Proxmox VE storages. If ever allowing that in the future, at
> least it would be worth extracting the tar archive in a restricted
> context (e.g. user namespace with ID mapped mount or seccomp).
> 
> Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
> ---
> 
> Changes in v3:
> * Use user namespace when restoring directory (and use tar instead of
>   rsync, because it is easier to split in privileged and unprivileged
>   half)
> 
>  src/PVE/LXC/Create.pm | 141 ++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 141 insertions(+)
> 
> diff --git a/src/PVE/LXC/Create.pm b/src/PVE/LXC/Create.pm
> index 8c8cb9a..8657ac1 100644
> --- a/src/PVE/LXC/Create.pm
> +++ b/src/PVE/LXC/Create.pm
> @@ -7,6 +7,7 @@ use File::Path;
>  use Fcntl;
>  
>  use PVE::RPCEnvironment;
> +use PVE::RESTEnvironment qw(log_warn);
>  use PVE::Storage::PBSPlugin;
>  use PVE::Storage::Plugin;
>  use PVE::Storage;
> @@ -26,6 +27,24 @@ sub restore_archive {
>  	if ($scfg->{type} eq 'pbs') {
>  	    return restore_proxmox_backup_archive($storage_cfg, $archive, $rootdir, $conf, $no_unpack_error, $bwlimit);
>  	}
> +	if (PVE::Storage::storage_has_feature($storage_cfg, $storeid, 'backup-provider')) {
> +	    my $log_function = sub {
> +		my ($log_level, $message) = @_;
> +		my $prefix = $log_level eq 'err' ? 'ERROR' : uc($log_level);
> +		print "$prefix: $message\n";
> +	    };
> +	    my $backup_provider =
> +		PVE::Storage::new_backup_provider($storage_cfg, $storeid, $log_function);
> +	    return restore_external_archive(
> +		$backup_provider,
> +		$storeid,
> +		$volname,
> +		$rootdir,
> +		$conf,
> +		$no_unpack_error,
> +		$bwlimit,
> +	    );
> +	}
>      }
>  
>      $archive = PVE::Storage::abs_filesystem_path($storage_cfg, $archive) if $archive ne '-';
> @@ -127,6 +146,54 @@ sub restore_tar_archive {
>      die $err if $err && !$no_unpack_error;
>  }
>  
> +sub restore_external_archive {
> +    my ($backup_provider, $storeid, $volname, $rootdir, $conf, $no_unpack_error, $bwlimit) = @_;
> +
> +    die "refusing to restore privileged container backup from external source\n"
> +	if !$conf->{unprivileged};
> +
> +    my ($mechanism, $vmtype) = $backup_provider->restore_get_mechanism($volname, $storeid);
> +    die "cannot restore non-LXC guest of type '$vmtype'\n" if $vmtype ne 'lxc';
> +
> +    my $info = $backup_provider->restore_container_init($volname, $storeid, {});
> +    eval {
> +	if ($mechanism eq 'tar') {
> +	    my $tar_path = $info->{'tar-path'}
> +		or die "did not get path to tar file from backup provider\n";
> +	    die "not a regular file '$tar_path'" if !-f $tar_path;
> +	    restore_tar_archive($tar_path, $rootdir, $conf, $no_unpack_error, $bwlimit);
> +	} elsif ($mechanism eq 'directory') {
> +	    my $directory = $info->{'archive-directory'}
> +		or die "did not get path to archive directory from backup provider\n";
> +	    die "not a directory '$directory'" if !-d $directory;
> +
> +	    my $create_cmd = [
> +		'tar',
> +		'cpf',
> +		'-',
> +		@PVE::Storage::Plugin::COMMON_TAR_FLAGS,
> +		"--directory=$directory",
> +		'.',
> +	    ];
> +
> +	    my $extract_cmd = restore_tar_archive_command($conf, undef, $rootdir, $bwlimit);
> +
> +	    eval { PVE::Tools::run_command([$create_cmd, $extract_cmd]); };
> +	    die $@ if $@ && !$no_unpack_error;
> +	} else {
> +	    die "mechanism '$mechanism' requested by backup provider is not supported for LXCs\n";
> +	}
> +    };
> +    my $err = $@;
> +    eval { $backup_provider->restore_container_cleanup($volname, $storeid, {}); };
> +    if (my $cleanup_err = $@) {
> +	die $cleanup_err if !$err;
> +	warn $cleanup_err;
> +    }
> +    die $err if $err;
> +}
> +
>  sub recover_config {
>      my ($storage_cfg, $volid, $vmid) = @_;
>  
> @@ -135,6 +202,8 @@ sub recover_config {
>  	my $scfg = PVE::Storage::storage_check_enabled($storage_cfg, $storeid);
>  	if ($scfg->{type} eq 'pbs') {
>  	    return recover_config_from_proxmox_backup($storage_cfg, $volid, $vmid);
> +	} elsif (PVE::Storage::storage_has_feature($storage_cfg, $storeid, 'backup-provider')) {
> +	    return recover_config_from_external_backup($storage_cfg, $volid, $vmid);
>  	}
>      }
>  
> @@ -209,6 +278,26 @@ sub recover_config_from_tar {
>      return wantarray ? ($conf, $mp_param) : $conf;
>  }
>  
> +sub recover_config_from_external_backup {
> +    my ($storage_cfg, $volid, $vmid) = @_;
> +
> +    $vmid //= 0;
> +
> +    my $raw = PVE::Storage::extract_vzdump_config($storage_cfg, $volid);
> +
> +    my $conf = PVE::LXC::Config::parse_pct_config("/lxc/${vmid}.conf" , $raw);
> +
> +    delete $conf->{snapshots};
> +
> +    my $mp_param = {};
> +    PVE::LXC::Config->foreach_volume($conf, sub {
> +	my ($ms, $mountpoint) = @_;
> +	$mp_param->{$ms} = $conf->{$ms};
> +    });
> +
> +    return wantarray ? ($conf, $mp_param) : $conf;
> +}
> +
>  sub restore_configuration {
>      my ($vmid, $storage_cfg, $archive, $rootdir, $conf, $restricted, $unique, $skip_fw) = @_;
>  
> @@ -218,6 +307,26 @@ sub restore_configuration {
>  	if ($scfg->{type} eq 'pbs') {
>  	    return restore_configuration_from_proxmox_backup($vmid, $storage_cfg, $archive, $rootdir, $conf, $restricted, $unique, $skip_fw);
>  	}
> +	if (PVE::Storage::storage_has_feature($storage_cfg, $storeid, 'backup-provider')) {
> +	    my $log_function = sub {
> +		my ($log_level, $message) = @_;
> +		my $prefix = $log_level eq 'err' ? 'ERROR' : uc($log_level);
> +		print "$prefix: $message\n";
> +	    };
> +	    my $backup_provider =
> +		PVE::Storage::new_backup_provider($storage_cfg, $storeid, $log_function);
> +	    return restore_configuration_from_external_backup(
> +		$backup_provider,
> +		$vmid,
> +		$storage_cfg,
> +		$archive,
> +		$rootdir,
> +		$conf,
> +		$restricted,
> +		$unique,
> +		$skip_fw,
> +	    );
> +	}
>      }
>      restore_configuration_from_etc_vzdump($vmid, $rootdir, $conf, $restricted, $unique, $skip_fw);
>  }
> @@ -258,6 +367,38 @@ sub restore_configuration_from_proxmox_backup {
>      }
>  }
>  
> +sub restore_configuration_from_external_backup {
> +    my ($backup_provider, $vmid, $storage_cfg, $archive, $rootdir, $conf, $restricted, $unique, $skip_fw) = @_;
> +
> +    my ($storeid, $volname) = PVE::Storage::parse_volume_id($archive);
> +    my $scfg = PVE::Storage::storage_config($storage_cfg, $storeid);
> +
> +    my ($vtype, $name, undef, undef, undef, undef, $format) =
> +	PVE::Storage::parse_volname($storage_cfg, $archive);
> +
> +    my $oldconf = recover_config_from_external_backup($storage_cfg, $archive, $vmid);
> +
> +    sanitize_and_merge_config($conf, $oldconf, $restricted, $unique);
> +
> +    my $firewall_config =
> +	$backup_provider->restore_get_firewall_config($volname, $storeid);
> +
> +    if ($firewall_config) {
> +	my $pve_firewall_dir = '/etc/pve/firewall';
> +	my $pct_fwcfg_target = "${pve_firewall_dir}/${vmid}.fw";
> +	if ($skip_fw) {
> +	    warn "ignoring firewall config from backup archive, lacking API permission to modify firewall.\n";
> +	    warn "old firewall configuration in '$pct_fwcfg_target' left in place!\n"
> +		if -e $pct_fwcfg_target;
> +	} else {
> +	    mkdir $pve_firewall_dir; # make sure the directory exists
> +	    PVE::Tools::file_set_contents($pct_fwcfg_target, $firewall_config);
> +	}
> +    }
> +
> +    return;
> +}
> +
>  sub sanitize_and_merge_config {
>      my ($conf, $oldconf, $restricted, $unique) = @_;
>  
> -- 
> 2.39.5
> 
> 
> 


_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [pve-devel] [RFC container v3 27/34] create: factor out tar restore command helper
  2024-11-07 16:51 ` [pve-devel] [RFC container v3 27/34] create: factor out tar restore command helper Fiona Ebner
@ 2024-11-12 16:28   ` Fabian Grünbichler
  2024-11-12 17:08   ` [pve-devel] applied: " Thomas Lamprecht
  1 sibling, 0 replies; 63+ messages in thread
From: Fabian Grünbichler @ 2024-11-12 16:28 UTC (permalink / raw)
  To: Proxmox VE development discussion

Reviewed-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>

IMHO this would also be a candidate for applying now - but held off
because of the RFC prefix ;)

On November 7, 2024 5:51 pm, Fiona Ebner wrote:
> In preparation to re-use it for restore from backup providers.
> 
> Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
> ---
> 
> New in v3.
> 
>  src/PVE/LXC/Create.pm | 42 +++++++++++++++++++++++++-----------------
>  1 file changed, 25 insertions(+), 17 deletions(-)
> 
> diff --git a/src/PVE/LXC/Create.pm b/src/PVE/LXC/Create.pm
> index 7c5bf0a..8c8cb9a 100644
> --- a/src/PVE/LXC/Create.pm
> +++ b/src/PVE/LXC/Create.pm
> @@ -59,12 +59,34 @@ sub restore_proxmox_backup_archive {
>  	$scfg, $storeid, $cmd, $param, userns_cmd => $userns_cmd);
>  }
>  
> -sub restore_tar_archive {
> -    my ($archive, $rootdir, $conf, $no_unpack_error, $bwlimit) = @_;
> +my sub restore_tar_archive_command {
> +    my ($conf, $opts, $rootdir, $bwlimit) = @_;
>  
>      my ($id_map, $root_uid, $root_gid) = PVE::LXC::parse_id_maps($conf);
>      my $userns_cmd = PVE::LXC::userns_command($id_map);
>  
> +    my $cmd = [@$userns_cmd, 'tar', 'xpf', '-', $opts->@*, '--totals',
> +               @PVE::Storage::Plugin::COMMON_TAR_FLAGS,
> +               '-C', $rootdir];
> +
> +    # skip-old-files doesn't have anything to do with time (old/new), but is
> +    # simply -k (annoyingly also called --keep-old-files) without the 'treat
> +    # existing files as errors' part... iow. it's bsdtar's interpretation of -k
> +    # *sigh*, gnu...
> +    push @$cmd, '--skip-old-files';
> +    push @$cmd, '--anchored';
> +    push @$cmd, '--exclude' , './dev/*';
> +
> +    if (defined($bwlimit)) {
> +	$cmd = [ ['cstream', '-t', $bwlimit*1024], $cmd ];
> +    }
> +
> +    return $cmd;
> +}
> +
> +sub restore_tar_archive {
> +    my ($archive, $rootdir, $conf, $no_unpack_error, $bwlimit) = @_;
> +
>      my $archive_fh;
>      my $tar_input = '<&STDIN';
>      my @compression_opt;
> @@ -92,21 +114,7 @@ sub restore_tar_archive {
>  	$tar_input = '<&'.fileno($archive_fh);
>      }
>  
> -    my $cmd = [@$userns_cmd, 'tar', 'xpf', '-', @compression_opt, '--totals',
> -               @PVE::Storage::Plugin::COMMON_TAR_FLAGS,
> -               '-C', $rootdir];
> -
> -    # skip-old-files doesn't have anything to do with time (old/new), but is
> -    # simply -k (annoyingly also called --keep-old-files) without the 'treat
> -    # existing files as errors' part... iow. it's bsdtar's interpretation of -k
> -    # *sigh*, gnu...
> -    push @$cmd, '--skip-old-files';
> -    push @$cmd, '--anchored';
> -    push @$cmd, '--exclude' , './dev/*';
> -
> -    if (defined($bwlimit)) {
> -	$cmd = [ ['cstream', '-t', $bwlimit*1024], $cmd ];
> -    }
> +    my $cmd = restore_tar_archive_command($conf, [@compression_opt], $rootdir, $bwlimit);
>  
>      if ($archive eq '-') {
>  	print "extracting archive from STDIN\n";
> -- 
> 2.39.5
> 
> 
> 


_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [pve-devel] [RFC qemu v3 06/34] PVE backup: add target ID in backup state
  2024-11-07 16:51 ` [pve-devel] [RFC qemu v3 06/34] PVE backup: add target ID in backup state Fiona Ebner
@ 2024-11-12 16:46   ` Fabian Grünbichler
  2024-11-13  9:22     ` Fiona Ebner
  0 siblings, 1 reply; 63+ messages in thread
From: Fabian Grünbichler @ 2024-11-12 16:46 UTC (permalink / raw)
  To: Proxmox VE development discussion

On November 7, 2024 5:51 pm, Fiona Ebner wrote:
> In preparation for allowing multiple backup providers. Each backup
> target can then have its own dirty bitmap and there can be additional
> checks that the current backup state is actually associated to the
> expected target.
> 
> Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
> ---
> 
> No changes in v3.
> 
>  pve-backup.c | 8 +++++++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/pve-backup.c b/pve-backup.c
> index d931746453..e8031bb89c 100644
> --- a/pve-backup.c
> +++ b/pve-backup.c
> @@ -70,6 +70,7 @@ static struct PVEBackupState {
>      JobTxn *txn;
>      CoMutex backup_mutex;
>      CoMutex dump_callback_mutex;
> +    char *target_id;
>  } backup_state;
>  
>  static void pvebackup_init(void)
> @@ -848,7 +849,7 @@ UuidInfo coroutine_fn *qmp_backup(
>  
>      if (backup_state.di_list) {
>          error_set(errp, ERROR_CLASS_GENERIC_ERROR,
> -                  "previous backup not finished");
> +                  "previous backup by provider '%s' not finished", backup_state.target_id);
>          qemu_co_mutex_unlock(&backup_state.backup_mutex);
>          return NULL;
>      }
> @@ -1100,6 +1101,11 @@ UuidInfo coroutine_fn *qmp_backup(
>      backup_state.vmaw = vmaw;
>      backup_state.pbs = pbs;
>  
> +    if (backup_state.target_id) {
> +        g_free(backup_state.target_id);
> +    }
> +    backup_state.target_id = g_strdup("Proxmox");

if we take this opportunity to also support multiple PBS targets while
we are at it, it might make sense to make this more of a "legacy" value?
or not set it at all here to opt into the legacy behaviour?

> +
>      backup_state.di_list = di_list;
>  
>      uuid_info = g_malloc0(sizeof(*uuid_info));
> -- 
> 2.39.5
> 
> 
> 


_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


^ permalink raw reply	[flat|nested] 63+ messages in thread

* [pve-devel] applied: [RFC container v3 27/34] create: factor out tar restore command helper
  2024-11-07 16:51 ` [pve-devel] [RFC container v3 27/34] create: factor out tar restore command helper Fiona Ebner
  2024-11-12 16:28   ` Fabian Grünbichler
@ 2024-11-12 17:08   ` Thomas Lamprecht
  1 sibling, 0 replies; 63+ messages in thread
From: Thomas Lamprecht @ 2024-11-12 17:08 UTC (permalink / raw)
  To: Proxmox VE development discussion, Fiona Ebner

Am 07.11.24 um 17:51 schrieb Fiona Ebner:
> In preparation to re-use it for restore from backup providers.
> 
> Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
> ---
> 
> New in v3.
> 
>  src/PVE/LXC/Create.pm | 42 +++++++++++++++++++++++++-----------------
>  1 file changed, 25 insertions(+), 17 deletions(-)
> 
>

applied this one with Fabian's T-b, seemed like a sensible clean up on
its own to me, thanks!


_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [pve-devel] [RFC qemu v3 06/34] PVE backup: add target ID in backup state
  2024-11-12 16:46   ` Fabian Grünbichler
@ 2024-11-13  9:22     ` Fiona Ebner
  2024-11-13  9:33       ` Fiona Ebner
  2024-11-13 11:16       ` Fabian Grünbichler
  0 siblings, 2 replies; 63+ messages in thread
From: Fiona Ebner @ 2024-11-13  9:22 UTC (permalink / raw)
  To: Proxmox VE development discussion, Fabian Grünbichler

On 12.11.24 5:46 PM, Fabian Grünbichler wrote:
> On November 7, 2024 5:51 pm, Fiona Ebner wrote:
>> +    backup_state.target_id = g_strdup("Proxmox");
> 
> if we take this opportunity to also support multiple PBS targets while
> we are at it, it might make sense to make this more of a "legacy" value?
> or not set it at all here to opt into the legacy behaviour?
> 

Why isn't "Proxmox" a good legacy value? When we add support for passing
in a target ID to qmp_backup(), I had in mind using "PBS-$storeid" or
something along those lines.


_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [pve-devel] [RFC qemu v3 06/34] PVE backup: add target ID in backup state
  2024-11-13  9:22     ` Fiona Ebner
@ 2024-11-13  9:33       ` Fiona Ebner
  2024-11-13 11:16       ` Fabian Grünbichler
  1 sibling, 0 replies; 63+ messages in thread
From: Fiona Ebner @ 2024-11-13  9:33 UTC (permalink / raw)
  To: Proxmox VE development discussion, Fabian Grünbichler

On 13.11.24 10:22 AM, Fiona Ebner wrote:
> On 12.11.24 5:46 PM, Fabian Grünbichler wrote:
>> On November 7, 2024 5:51 pm, Fiona Ebner wrote:
>>> +    backup_state.target_id = g_strdup("Proxmox");
>>
>> if we take this opportunity to also support multiple PBS targets while
>> we are at it, it might make sense to make this more of a "legacy" value?
>> or not set it at all here to opt into the legacy behaviour?
>>
> 
> Why isn't "Proxmox" a good legacy value? When we add support for passing
> in a target ID to qmp_backup(), I had in mind using "PBS-$storeid" or
> something along those lines.

Also, this value is used in error messages like "previous backup by
provider %s not finished", so "Proxmox" fits there too. It was
"provider_id" early on and then changed to "target_id" for the very
reason that a single provider might want multiple bitmaps for multiple
targets, so I guess I should adapt the error message to something like
"previous backup for target %s not finished".


_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [pve-devel] [RFC common v3 10/34] env: add module with helpers to run a Perl subroutine in a user namespace
  2024-11-12 14:20   ` Fabian Grünbichler
@ 2024-11-13 10:08     ` Fiona Ebner
  2024-11-13 11:15       ` Fabian Grünbichler
  0 siblings, 1 reply; 63+ messages in thread
From: Fiona Ebner @ 2024-11-13 10:08 UTC (permalink / raw)
  To: Proxmox VE development discussion, Fabian Grünbichler

On 12.11.24 3:20 PM, Fabian Grünbichler wrote:
> On November 7, 2024 5:51 pm, Fiona Ebner wrote:
>> The first use case is running the container backup subroutine for
>> +package PVE::Env;
> 
> I agree with Thomas that this name might be a bit too generic ;)
> 
> I also wonder - since this seems to be only used in pve-container, and
> it really mostly makes sense in that context, wouldn't it be better off
> there? or do we expect other areas where we need userns handling?
> (granted, some of the comments below would require other changes to
> pve-common anyway ;))
> 

The only other use-case I'm aware of is where I got the code from
originally, i.e. pve-builpkg. Sure, I can move it to pve-container to
start out.

>> +
>> +use strict;
>> +use warnings;
>> +
>> +use Fcntl qw(O_WRONLY);
>> +use POSIX qw(EINTR);
>> +use Socket;
>> +
>> +require qw(syscall.ph);
> 
> PVE::Syscall already does this, and has the following:
> 
> BEGIN {
>     die "syscall.ph can only be required once!\n" if $INC{'syscall.ph'};
>     require("syscall.ph");
> 
> don't those two clash? I think those syscall related parts should
> probably move there?
> 

Hm, never experienced this error, but sure, will move the relevant parts.

>> +
>> +use constant {CLONE_NEWNS   => 0x00020000,
>> +              CLONE_NEWUSER => 0x10000000};
>> +
>> +sub unshare($) {
>> +    my ($flags) = @_;
>> +    return 0 == syscall(272, $flags);
>> +}
> 
> this is PVE::Tools::unshare, maybe the latter should move here?
> 

I'll just re-use the one from Tools then when moving to pve-container.

>> +
>> +sub __set_id_map($$$) {
>> +    my ($pid, $what, $value) = @_;
>> +    sysopen(my $fd, "/proc/$pid/${what}_map", O_WRONLY)
>> +	or die "failed to open child process' ${what}_map\n";
>> +    my $rc = syswrite($fd, $value);
>> +    if (!$rc || $rc != length($value)) {
>> +	die "failed to set sub$what: $!\n";
>> +    }
>> +    close($fd);
>> +}
>> +
>> +sub set_id_map($$) {
>> +    my ($pid, $id_map) = @_;
>> +
>> +    my $gid_map = '';
>> +    my $uid_map = '';
>> +
>> +    for my $map ($id_map->@*) {
>> +	my ($type, $ct, $host, $length) = $map->@*;
>> +
>> +	$gid_map .= "$ct $host $length\n" if $type eq 'g';
>> +	$uid_map .= "$ct $host $length\n" if $type eq 'u';
>> +    }
>> +
>> +    __set_id_map($pid, 'gid', $gid_map) if $gid_map;
>> +    __set_id_map($pid, 'uid', $uid_map) if $uid_map;
>> +}
> 
> do we gain a lot here from not just using newuidmap/newgidmap?
> 

I didn't know those commands existed :P Running commands seems more
wasteful than just writing a file, but will change if you insist.
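
For reference, the command-based variant would look roughly like the
following - a sketch only, the helper name is made up; per
newuidmap(1)/newgidmap(1) the argument order is <pid> followed by
repeated <id-in-namespace> <id-on-host> <count> triples:

    # shells out to the setuid helpers from the 'uidmap' package
    # instead of writing /proc/$pid/{u,g}id_map directly
    sub set_id_map_via_tools {
        my ($pid, $id_map) = @_;

        my (@uid_args, @gid_args);
        for my $map ($id_map->@*) {
            my ($type, $ct, $host, $length) = $map->@*;
            push @uid_args, $ct, $host, $length if $type eq 'u';
            push @gid_args, $ct, $host, $length if $type eq 'g';
        }

        PVE::Tools::run_command(['newuidmap', $pid, @uid_args]) if @uid_args;
        PVE::Tools::run_command(['newgidmap', $pid, @gid_args]) if @gid_args;
    }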

>> +sub forked(&%) {
> 
> this seems very similar to the already existing PVE::Tools::run_fork /
> run_fork_with_timeout helpers.. any reason we can't extend those with
> `afterfork` support and use them?
> 

Haven't looked into it, but will do!

>> +
>> +sub run_in_userns(&;$) {
>> +    my ($code, $id_map) = @_;
>> +    socketpair(my $sp, my $sc, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
>> +	or die "socketpair: $!\n";
>> +    forked(sub {
>> +	close($sp);
>> +	unshare(CLONE_NEWUSER|CLONE_NEWNS) or die "unshare(NEWUSER|NEWNS): $!\n";
> 
> I guess we can't set our "own" maps here for lack of capabilities and
> avoid the whole afterfork thing entirely? at least I couldn't get it to
> work ;)
> 

AFAIU, yes.


_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [pve-devel] [POC storage v3 14/34] add backup provider example
  2024-11-07 16:51 ` [pve-devel] [POC storage v3 14/34] add backup provider example Fiona Ebner
@ 2024-11-13 10:52   ` Fabian Grünbichler
  0 siblings, 0 replies; 63+ messages in thread
From: Fabian Grünbichler @ 2024-11-13 10:52 UTC (permalink / raw)
  To: Proxmox VE development discussion

didn't give this too close a look since it's an example only, but the
hard-coded NBD indices make me wonder whether we want to have some sort
of mechanism to "reserve" NBD slots while using them, at least for *our*
usage?
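
Something along these lines could serve as a reservation helper -
purely a sketch to illustrate the idea; the lock directory, the device
count and the sysfs check are assumptions for illustration, not part
of the series:

    # find a free /dev/nbdX and hold an exclusive lock on it, so that
    # concurrent backup/restore tasks don't grab the same device
    use Fcntl qw(:flock O_CREAT O_RDWR);

    sub reserve_nbd_device {
        my $lockdir = '/run/lock/pve-nbd'; # hypothetical location
        mkdir $lockdir; # best-effort, might already exist

        for my $i (0 .. 15) {
            next if !-b "/dev/nbd${i}";
            # skip devices already connected by another process
            next if -e "/sys/block/nbd${i}/pid";

            sysopen(my $lock_fh, "${lockdir}/nbd${i}.lock", O_CREAT | O_RDWR)
                or next;
            if (flock($lock_fh, LOCK_EX | LOCK_NB)) {
                # caller keeps $lock_fh open as long as it uses the device
                return ("/dev/nbd${i}", $lock_fh);
            }
            close($lock_fh);
        }
        die "unable to reserve a free NBD device\n";
    }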

On November 7, 2024 5:51 pm, Fiona Ebner wrote:
> The example uses a simple directory structure to save the backups,
> grouped by guest ID. VM backups are saved as configuration files and
> qcow2 images, with backing files when doing incremental backups.
> Container backups are saved as configuration files and a tar file or
> squashfs image (added to test the 'directory' restore mechanism).
> 
> Whether to use incremental VM backups and which backup mechanisms to
> use can be configured in the storage configuration.
> 
> The 'nbdinfo' binary from the 'libnbd-bin' package is required for
> backup mechanism 'nbd' for VM backups, the 'mksquashfs' binary from the
> 'squashfs-tools' package is required for backup mechanism 'squashfs' for
> containers.
> 
> Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
> ---
> 
> Changes in v3:
> * adapt to API changes
> * use NBD export when restoring VM image, to make incremental backups
>   using qcow2 chains work again
> 
>  .../BackupProvider/Plugin/DirectoryExample.pm | 697 ++++++++++++++++++
>  src/PVE/BackupProvider/Plugin/Makefile        |   2 +-
>  .../Custom/BackupProviderDirExamplePlugin.pm  | 307 ++++++++
>  src/PVE/Storage/Custom/Makefile               |   5 +
>  src/PVE/Storage/Makefile                      |   1 +
>  5 files changed, 1011 insertions(+), 1 deletion(-)
>  create mode 100644 src/PVE/BackupProvider/Plugin/DirectoryExample.pm
>  create mode 100644 src/PVE/Storage/Custom/BackupProviderDirExamplePlugin.pm
>  create mode 100644 src/PVE/Storage/Custom/Makefile
> 
> diff --git a/src/PVE/BackupProvider/Plugin/DirectoryExample.pm b/src/PVE/BackupProvider/Plugin/DirectoryExample.pm
> new file mode 100644
> index 0000000..99825ef
> --- /dev/null
> +++ b/src/PVE/BackupProvider/Plugin/DirectoryExample.pm
> @@ -0,0 +1,697 @@
> +package PVE::BackupProvider::Plugin::DirectoryExample;
> +
> +use strict;
> +use warnings;
> +
> +use Fcntl qw(SEEK_SET);
> +use File::Path qw(make_path remove_tree);
> +use IO::File;
> +use IPC::Open3;
> +
> +use PVE::Storage::Plugin;
> +use PVE::Tools qw(file_get_contents file_read_firstline file_set_contents run_command);
> +
> +use base qw(PVE::BackupProvider::Plugin::Base);
> +
> +use constant {
> +    BLKDISCARD => 0x1277, # see linux/fs.h
> +};
> +
> +# Private helpers
> +
> +my sub log_info {
> +    my ($self, $message) = @_;
> +
> +    $self->{'log-function'}->('info', $message);
> +}
> +
> +my sub log_warning {
> +    my ($self, $message) = @_;
> +
> +    $self->{'log-function'}->('warn', $message);
> +}
> +
> +my sub log_error {
> +    my ($self, $message) = @_;
> +
> +    $self->{'log-function'}->('err', $message);
> +}
> +
> +# Try to use the same bitmap ID as last time for incremental backup if the storage is configured for
> +# incremental VM backup. Need to start fresh if there is no previous ID or the associated backup
> +# doesn't exist.
> +my sub get_bitmap_id {
> +    my ($self, $vmid, $vmtype) = @_;
> +
> +    return if $self->{'storage-plugin'}->get_vm_backup_mode($self->{scfg}) ne 'incremental';
> +
> +    my $previous_info_dir = "$self->{scfg}->{path}/$vmid/";
> +
> +    my $previous_info_file = "$previous_info_dir/previous-info";
> +    my $info = file_read_firstline($previous_info_file) // '';
> +    $self->{$vmid}->{'old-previous-info'} = $info;
> +    my ($bitmap_id, $previous_backup_id) = $info =~ m/^(\d+)\s+(\d+)$/;
> +    my $previous_backup_dir =
> +	$previous_backup_id ? "$self->{scfg}->{path}/$vmid/$vmtype-$previous_backup_id" : undef;

so the backup ID is an epoch - wouldn't it be nicer to use the formatted
one as subdir, rather than the epoch itself?
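
e.g. (sketch only) formatting it the way the Borg POC later in this
series does for its archive names:

    use POSIX qw(strftime);
    # human-readable directory name instead of the raw epoch
    my $dir_name = strftime("%FT%TZ", gmtime($self->{$vmid}->{'backup-time'}));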

> +
> +    if ($bitmap_id && -d $previous_backup_dir) {
> +	$self->{$vmid}->{'previous-backup-dir'} = $previous_backup_dir;
> +    } else {
> +	# need to start fresh if there is no previous ID or the associated backup doesn't exist
> +	$bitmap_id = $self->{$vmid}->{'backup-time'};
> +    }
> +
> +    $self->{$vmid}->{'bitmap-id'} = $bitmap_id;
> +    make_path($previous_info_dir);
> +    die "unable to create directory $previous_info_dir\n" if !-d $previous_info_dir;
> +    file_set_contents($previous_info_file, "$bitmap_id $self->{$vmid}->{'backup-time'}");
> +
> +    return $bitmap_id;
> +}
> +
> +# Backup Provider API
> +
> +sub new {
> +    my ($class, $storage_plugin, $scfg, $storeid, $log_function) = @_;
> +
> +    my $self = bless {
> +	scfg => $scfg,
> +	storeid => $storeid,
> +	'storage-plugin' => $storage_plugin,
> +	'log-function' => $log_function,
> +    }, $class;
> +
> +    return $self;
> +}
> +
> +sub provider_name {
> +    my ($self) = @_;
> +
> +    return 'dir provider example';
> +}
> +
> +# Hooks
> +
> +my sub job_start {
> +    my ($self, $start_time) = @_;
> +
> +    log_info($self, "job start hook called");
> +
> +    run_command(["modprobe", "nbd"]);

this duplicates the modprobe in qemu-server, but without the parameter..

> +
> +    log_info($self, "backup provider initialized successfully for new job $start_time");
> +}
> +
> +sub job_hook {
> +    my ($self, $phase, $info) = @_;
> +
> +    if ($phase eq 'start') {
> +	job_start($self, $info->{'start-time'});
> +    } elsif ($phase eq 'end') {
> +	log_info($self, "job end hook called");
> +    } elsif ($phase eq 'abort') {
> +	log_info($self, "job abort hook called with error - $info->{error}");
> +    }
> +
> +    # ignore unknown phase
> +
> +    return;
> +}
> +
> +my sub backup_start {
> +    my ($self, $vmid, $vmtype, $backup_time) = @_;
> +
> +    log_info($self, "backup start hook called");
> +
> +    my $backup_dir = $self->{scfg}->{path} . "/" . $self->{$vmid}->{archive};
> +
> +    make_path($backup_dir);
> +    die "unable to create directory $backup_dir\n" if !-d $backup_dir;
> +
> +    $self->{$vmid}->{'backup-time'} = $backup_time;
> +    $self->{$vmid}->{'backup-dir'} = $backup_dir;
> +    $self->{$vmid}->{'task-size'} = 0;
> +}
> +
> +my sub backup_abort {
> +    my ($self, $vmid, $error) = @_;
> +
> +    log_info($self, "backup abort hook called");
> +
> +    $self->{$vmid}->{failed} = 1;
> +
> +    if (my $dir = $self->{$vmid}->{'backup-dir'}) {
> +	eval { remove_tree($dir) };
> +	log_warning($self, "unable to clean up $dir - $@") if $@;
> +    }
> +
> +    # Restore old previous-info so next attempt can re-use bitmap again
> +    if (my $info = $self->{$vmid}->{'old-previous-info'}) {
> +	my $previous_info_dir = "$self->{scfg}->{path}/$vmid/";
> +	my $previous_info_file = "$previous_info_dir/previous-info";
> +	file_set_contents($previous_info_file, $info);
> +    }
> +}
> +
> +sub backup_hook {
> +    my ($self, $phase, $vmid, $vmtype, $info) = @_;
> +
> +    if ($phase eq 'start') {
> +	backup_start($self, $vmid, $vmtype, $info->{'start-time'});
> +    } elsif ($phase eq 'end') {
> +	log_info($self, "backup end hook called");
> +    } elsif ($phase eq 'abort') {
> +	backup_abort($self, $vmid, $info->{error});
> +    } elsif ($phase eq 'prepare') {
> +	my $dir = $self->{$vmid}->{'backup-dir'};
> +	chown($info->{'backup-user-id'}, -1, $dir)
> +	    or die "unable to change owner for $dir\n";
> +    }
> +
> +    # ignore unknown phase
> +
> +    return;
> +}
> +
> +sub backup_get_mechanism {
> +    my ($self, $vmid, $vmtype) = @_;
> +
> +    return ('directory', undef) if $vmtype eq 'lxc';
> +
> +    if ($vmtype eq 'qemu') {
> +	my $backup_mechanism = $self->{'storage-plugin'}->get_vm_backup_mechanism($self->{scfg});
> +	return ($backup_mechanism, get_bitmap_id($self, $vmid, $vmtype));
> +    }
> +
> +    die "unsupported guest type '$vmtype'\n";
> +}
> +
> +sub backup_get_archive_name {
> +    my ($self, $vmid, $vmtype, $backup_time) = @_;
> +
> +    return $self->{$vmid}->{archive} = "${vmid}/${vmtype}-${backup_time}";

same question here w.r.t. epoch vs RFC3339

> +}
> +
> +sub backup_get_task_size {
> +    my ($self, $vmid) = @_;
> +
> +    return $self->{$vmid}->{'task-size'};
> +}
> +
> +sub backup_handle_log_file {
> +    my ($self, $vmid, $filename) = @_;
> +
> +    my $log_dir = $self->{$vmid}->{'backup-dir'};
> +    if ($self->{$vmid}->{failed}) {
> +	$log_dir .= ".failed";
> +    }
> +    make_path($log_dir);
> +    die "unable to create directory $log_dir\n" if !-d $log_dir;
> +
> +    my $data = file_get_contents($filename);
> +    my $target = "${log_dir}/backup.log";
> +    file_set_contents($target, $data);
> +}
> +
> +my sub backup_block_device {
> +    my ($self, $vmid, $devicename, $size, $path, $bitmap_mode, $next_dirty_region, $bandwidth_limit) = @_;
> +
> +    # TODO honor bandwidth_limit
> +
> +    my $previous_backup_dir = $self->{$vmid}->{'previous-backup-dir'};
> +    my $incremental = $previous_backup_dir && $bitmap_mode eq 'reuse';
> +    my $target = "$self->{$vmid}->{'backup-dir'}/${devicename}.qcow2";
> +    my $target_base = $incremental ? "${previous_backup_dir}/${devicename}.qcow2" : undef;
> +    my $create_cmd = ["qemu-img", "create", "-f", "qcow2", $target, $size];
> +    push $create_cmd->@*, "-b", $target_base, "-F", "qcow2" if $target_base;
> +    run_command($create_cmd);
> +
> +    eval {
> +	# allows to easily write to qcow2 target
> +	run_command(["qemu-nbd", "-c", "/dev/nbd15", $target, "--format=qcow2"]);

doesn't this (potentially) clash with other NBD usage?

> +
> +	my $block_size = 4 * 1024 * 1024; # 4 MiB
> +
> +	my $in_fh = IO::File->new($path, "r+")
> +	    or die "unable to open NBD backup source - $!\n";
> +	my $out_fh = IO::File->new("/dev/nbd15", "r+")
> +	    or die "unable to open NBD backup target - $!\n";
> +
> +	my $buffer = '';
> +
> +	while (scalar((my $region_offset, my $region_length) = $next_dirty_region->())) {
> +	    sysseek($in_fh, $region_offset, SEEK_SET)
> +		// die "unable to seek '$region_offset' in NBD backup source - $!";
> +	    sysseek($out_fh, $region_offset, SEEK_SET)
> +		// die "unable to seek '$region_offset' in NBD backup target - $!";
> +
> +	    my $local_offset = 0; # within the region
> +	    while ($local_offset < $region_length) {
> +		my $remaining = $region_length - $local_offset;
> +		my $request_size = $remaining < $block_size ? $remaining : $block_size;
> +		my $offset = $region_offset + $local_offset;
> +
> +		my $read = sysread($in_fh, $buffer, $request_size);
> +
> +		die "failed to read from backup source - $!\n" if !defined($read);
> +		die "premature EOF while reading backup source\n" if $read == 0;
> +
> +		my $written = 0;
> +		while ($written < $read) {
> +		    my $res = syswrite($out_fh, $buffer, $request_size - $written, $written);
> +		    die "failed to write to backup target - $!\n" if !defined($res);
> +		    die "unable to progress writing to backup target\n" if $res == 0;
> +		    $written += $res;
> +		}
> +
> +		ioctl($in_fh, BLKDISCARD, pack('QQ', int($offset), int($request_size)));
> +
> +		$local_offset += $request_size;
> +	    }
> +	}
> +    };
> +    my $err = $@;
> +
> +    eval { run_command(["qemu-nbd", "-d", "/dev/nbd15" ]); };
> +    log_warning($self, "unable to disconnect NBD backup target - $@") if $@;
> +
> +    die $err if $err;
> +}
> +
> +my sub backup_nbd {
> +    my ($self, $vmid, $devicename, $size, $nbd_path, $bitmap_mode, $bitmap_name, $bandwidth_limit) = @_;
> +
> +    # TODO honor bandwidth_limit
> +
> +    die "need 'nbdinfo' binary from package libnbd-bin\n" if !-e "/usr/bin/nbdinfo";
> +
> +    my $nbd_info_uri = "nbd+unix:///${devicename}?socket=${nbd_path}";
> +    my $qemu_nbd_uri = "nbd:unix:${nbd_path}:exportname=${devicename}";
> +
> +    my $cpid;
> +    my $error_fh;
> +    my $next_dirty_region;
> +
> +    # If there is no dirty bitmap, it can be treated as if there's a full dirty one. The output of
> +    # nbdinfo is a list of tuples with offset, length, type, description. The first bit of 'type' is
> +    # set when the bitmap is dirty, see QEMU's docs/interop/nbd.txt
> +    my $dirty_bitmap = [];
> +    if ($bitmap_mode ne 'none') {
> +	my $input = IO::File->new();
> +	my $info = IO::File->new();
> +	$error_fh = IO::File->new();
> +	my $nbdinfo_cmd = ["nbdinfo", $nbd_info_uri, "--map=qemu:dirty-bitmap:${bitmap_name}"];
> +	$cpid = open3($input, $info, $error_fh, $nbdinfo_cmd->@*)
> +	    or die "failed to spawn nbdinfo child - $!\n";
> +
> +	$next_dirty_region = sub {
> +	    my ($offset, $length, $type);
> +	    do {
> +		my $line = <$info>;
> +		return if !$line;
> +		die "unexpected output from nbdinfo - $line\n"
> +		    if $line !~ m/^\s*(\d+)\s*(\d+)\s*(\d+)/; # also untaints
> +		($offset, $length, $type) = ($1, $2, $3);
> +	    } while (($type & 0x1) == 0); # not dirty
> +	    return ($offset, $length);
> +	};
> +    } else {
> +	my $done = 0;
> +	$next_dirty_region = sub {
> +	    return if $done;
> +	    $done = 1;
> +	    return (0, $size);
> +	};
> +    }
> +
> +    eval {
> +	run_command(["qemu-nbd", "-c", "/dev/nbd0", $qemu_nbd_uri, "--format=raw", "--discard=on"]);

same question here (but with a different hard-coded index ;))

> +
> +	backup_block_device(
> +	    $self,
> +	    $vmid,
> +	    $devicename,
> +	    $size,
> +	    '/dev/nbd0',
> +	    $bitmap_mode,
> +	    $next_dirty_region,
> +	    $bandwidth_limit,
> +	);
> +    };
> +    my $err = $@;
> +
> +    eval { run_command(["qemu-nbd", "-d", "/dev/nbd0" ]); };
> +    log_warning($self, "unable to disconnect NBD backup source - $@") if $@;
> +
> +    if ($cpid) {
> +	my $waited;
> +	my $wait_limit = 5;
> +	for ($waited = 0; $waited < $wait_limit && waitpid($cpid, POSIX::WNOHANG) == 0; $waited++) {
> +	    kill 15, $cpid if $waited == 0;
> +	    sleep 1;
> +	}
> +	if ($waited == $wait_limit) {
> +	    kill 9, $cpid;
> +	    sleep 1;
> +	    log_warning($self, "unable to collect nbdinfo child process")
> +		if waitpid($cpid, POSIX::WNOHANG) == 0;
> +	}
> +    }
> +
> +    die $err if $err;
> +}
> +
> +my sub backup_vm_volume {
> +    my ($self, $vmid, $devicename, $info, $bandwidth_limit) = @_;
> +
> +    my $backup_mechanism = $self->{'storage-plugin'}->get_vm_backup_mechanism($self->{scfg});
> +
> +    if ($backup_mechanism eq 'nbd') {
> +	backup_nbd(
> +	    $self,
> +	    $vmid,
> +	    $devicename,
> +	    $info->{size},
> +	    $info->{'nbd-path'},
> +	    $info->{'bitmap-mode'},
> +	    $info->{'bitmap-name'},
> +	    $bandwidth_limit,
> +	);
> +    } elsif ($backup_mechanism eq 'block-device') {
> +	backup_block_device(
> +	    $self,
> +	    $vmid,
> +	    $devicename,
> +	    $info->{size},
> +	    $info->{path},
> +	    $info->{'bitmap-mode'},
> +	    $info->{'next-dirty-region'},
> +	    $bandwidth_limit,
> +	);
> +    } else {
> +	die "internal error - unknown VM backup mechanism '$backup_mechanism'\n";
> +    }
> +}
> +
> +sub backup_vm {
> +    my ($self, $vmid, $guest_config, $volumes, $info) = @_;
> +
> +    my $target = "$self->{$vmid}->{'backup-dir'}/guest.conf";
> +    file_set_contents($target, $guest_config);
> +
> +    $self->{$vmid}->{'task-size'} += -s $target;
> +
> +    if (my $firewall_config = $info->{'firewall-config'}) {
> +	$target = "$self->{$vmid}->{'backup-dir'}/firewall.conf";
> +	file_set_contents($target, $firewall_config);
> +
> +	$self->{$vmid}->{'task-size'} += -s $target;
> +    }
> +
> +    for my $devicename (sort keys $volumes->%*) {
> +	backup_vm_volume(
> +	    $self, $vmid, $devicename, $volumes->{$devicename}, $info->{'bandwidth-limit'});
> +    }
> +}
> +
> +my sub backup_directory_tar {
> +    my ($self, $vmid, $directory, $exclude_patterns, $sources, $bandwidth_limit) = @_;
> +
> +    # essentially copied from PVE/VZDump/LXC.pm' archive()
> +
> +    # copied from PVE::Storage::Plugin::COMMON_TAR_FLAGS
> +    my @tar_flags = qw(
> +	--one-file-system
> +	-p --sparse --numeric-owner --acls
> +	--xattrs --xattrs-include=user.* --xattrs-include=security.capability
> +	--warning=no-file-ignored --warning=no-xattr-write
> +    );
> +
> +    my $tar = ['tar', 'cpf', '-', '--totals', @tar_flags];
> +
> +    push @$tar, "--directory=$directory";
> +
> +    my @exclude_no_anchored = ();
> +    my @exclude_anchored = ();
> +    for my $pattern ($exclude_patterns->@*) {
> +	if ($pattern !~ m|^/|) {
> +	    push @exclude_no_anchored, $pattern;
> +	} else {
> +	    push @exclude_anchored, $pattern;
> +	}
> +    }
> +
> +    push @$tar, '--no-anchored';
> +    push @$tar, '--exclude=lost+found';
> +    push @$tar, map { "--exclude=$_" } @exclude_no_anchored;
> +
> +    push @$tar, '--anchored';
> +    push @$tar, map { "--exclude=.$_" } @exclude_anchored;
> +
> +    push @$tar, $sources->@*;
> +
> +    my $cmd = [ $tar ];
> +
> +    push @$cmd, [ 'cstream', '-t', $bandwidth_limit * 1024 ] if $bandwidth_limit;
> +
> +    my $target = "$self->{$vmid}->{'backup-dir'}/archive.tar";
> +    push @{$cmd->[-1]}, \(">" . PVE::Tools::shellquote($target));
> +
> +    my $logfunc = sub {
> +	my $line = shift;
> +	log_info($self, "tar: $line");
> +    };
> +
> +    PVE::Tools::run_command($cmd, logfunc => $logfunc);
> +
> +    return;
> +};
> +
> +# NOTE This only serves as an example to illustrate the 'directory' restore mechanism. It is not
> +# fleshed out properly, e.g. I didn't check if exclusion is compatible with
> +# proxmox-backup-client/rsync or xattrs/ACL/etc. work as expected!
> +my sub backup_directory_squashfs {
> +    my ($self, $vmid, $directory, $exclude_patterns, $bandwidth_limit) = @_;
> +
> +    my $target = "$self->{$vmid}->{'backup-dir'}/archive.sqfs";
> +
> +    my $mksquashfs = ['mksquashfs', $directory, $target, '-quiet', '-no-progress'];
> +
> +    push $mksquashfs->@*, '-wildcards';
> +
> +    for my $pattern ($exclude_patterns->@*) {
> +	if ($pattern !~ m|^/|) { # non-anchored
> +	    push $mksquashfs->@*, '-e', "... $pattern";
> +	} else { # anchored
> +	    push $mksquashfs->@*, '-e', substr($pattern, 1); # need to strip leading slash
> +	}
> +    }
> +
> +    my $cmd = [ $mksquashfs ];
> +
> +    push @$cmd, [ 'cstream', '-t', $bandwidth_limit * 1024 ] if $bandwidth_limit;
> +
> +    my $logfunc = sub {
> +	my $line = shift;
> +	log_info($self, "mksquashfs: $line");
> +    };
> +
> +    PVE::Tools::run_command($cmd, logfunc => $logfunc);
> +
> +    return;
> +};
> +
> +sub backup_container {
> +    my ($self, $vmid, $guest_config, $exclude_patterns, $info) = @_;
> +
> +    my $target = "$self->{$vmid}->{'backup-dir'}/guest.conf";
> +    file_set_contents($target, $guest_config);
> +
> +    $self->{$vmid}->{'task-size'} += -s $target;
> +
> +    if (my $firewall_config = $info->{'firewall-config'}) {
> +	$target = "$self->{$vmid}->{'backup-dir'}/firewall.conf";
> +	file_set_contents($target, $firewall_config);
> +
> +	$self->{$vmid}->{'task-size'} += -s $target;
> +    }
> +
> +    my $backup_mode = $self->{'storage-plugin'}->get_lxc_backup_mode($self->{scfg});
> +    if ($backup_mode eq 'tar') {
> +	backup_directory_tar(
> +	    $self,
> +	    $vmid,
> +	    $info->{directory},
> +	    $exclude_patterns,
> +	    $info->{sources},
> +	    $info->{'bandwidth-limit'},
> +	);
> +    } elsif ($backup_mode eq 'squashfs') {
> +	backup_directory_squashfs(
> +	    $self,
> +	    $vmid,
> +	    $info->{directory},
> +	    $exclude_patterns,
> +	    $info->{'bandwidth-limit'},
> +	);
> +    } else {
> +	die "got unexpected backup mode '$backup_mode' from storage plugin\n";
> +    }
> +}
> +
> +# Restore API
> +
> +sub restore_get_mechanism {
> +    my ($self, $volname, $storeid) = @_;
> +
> +    my (undef, $relative_backup_dir) = $self->{'storage-plugin'}->parse_volname($volname);
> +    my ($vmtype) = $relative_backup_dir =~ m!^\d+/([a-z]+)-!;
> +
> +    return ('qemu-img', $vmtype) if $vmtype eq 'qemu';
> +
> +    if ($vmtype eq 'lxc') {
> +	my (undef, $relative_backup_dir) = $self->{'storage-plugin'}->parse_volname($volname);
> +
> +	if (-e "$self->{scfg}->{path}/${relative_backup_dir}/archive.tar") {
> +	    $self->{'restore-mechanisms'}->{$volname} = 'tar';
> +	    return ('tar', $vmtype);
> +	}
> +
> +	if (-e "$self->{scfg}->{path}/${relative_backup_dir}/archive.sqfs") {
> +	    $self->{'restore-mechanisms'}->{$volname} = 'directory';
> +	    return ('directory', $vmtype)
> +	}
> +
> +	die "unable to find archive '$volname'\n";
> +    }
> +
> +    die "cannot restore unexpected guest type '$vmtype'\n";
> +}
> +
> +sub restore_get_guest_config {
> +    my ($self, $volname, $storeid) = @_;
> +
> +    my (undef, $relative_backup_dir) = $self->{'storage-plugin'}->parse_volname($volname);
> +    my $filename = "$self->{scfg}->{path}/${relative_backup_dir}/guest.conf";
> +
> +    return file_get_contents($filename);
> +}
> +
> +sub restore_get_firewall_config {
> +    my ($self, $volname, $storeid) = @_;
> +
> +    my (undef, $relative_backup_dir) = $self->{'storage-plugin'}->parse_volname($volname);
> +    my $filename = "$self->{scfg}->{path}/${relative_backup_dir}/firewall.conf";
> +
> +    return if !-e $filename;
> +
> +    return file_get_contents($filename);
> +}
> +
> +sub restore_vm_init {
> +    my ($self, $volname, $storeid) = @_;
> +
> +    my $res = {};
> +
> +    my (undef, $relative_backup_dir) = $self->{'storage-plugin'}->parse_volname($volname);
> +    my $backup_dir = "$self->{scfg}->{path}/${relative_backup_dir}";
> +
> +    my @backup_files = glob("$backup_dir/*");
> +    for my $backup_file (@backup_files) {
> +	next if $backup_file !~ m!^(.*/(.*)\.qcow2)$!;
> +	$backup_file = $1; # untaint
> +	$res->{$2}->{size} = PVE::Storage::Plugin::file_size_info($backup_file);
> +    }
> +
> +    return $res;
> +}
> +
> +sub restore_vm_cleanup {
> +    my ($self, $volname, $storeid) = @_;
> +
> +    return; # nothing to do
> +}
> +
> +sub restore_vm_volume_init {
> +    my ($self, $volname, $storeid, $devicename, $info) = @_;
> +
> +    my (undef, $relative_backup_dir) = $self->{'storage-plugin'}->parse_volname($volname);
> +    my $image = "$self->{scfg}->{path}/${relative_backup_dir}/${devicename}.qcow2";
> +    # NOTE Backing files are not allowed by Proxmox VE when restoring. The reason is that an
> +    # untrusted qcow2 image can specify an arbitrary backing file and thus leak data from the host.
> +    # For the sake of the directory example plugin, an NBD export is created, but this side-steps
> +    # the check and would allow the attack again. An actual implementation should check that the
> +    # backing file (or rather, the whole backing chain) is safe first!
> +    PVE::Tools::run_command(['qemu-nbd', '-c', '/dev/nbd7', $image]);

and another hard-coded index here - I really think we need some sort of
solution for this..
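
And regarding the safety NOTE quoted above, a backing-chain check could
look roughly like this - a sketch only, the helper name and the "must
stay below the backup directory" policy are assumptions here; it relies
on 'qemu-img info --backing-chain --output=json', which reports a
'full-backing-filename' per layer:

    use Cwd qw(realpath);
    use JSON qw(decode_json);

    my sub assert_backing_chain_safe {
        my ($image, $allowed_dir) = @_;

        my $json = '';
        PVE::Tools::run_command(
            ['qemu-img', 'info', '--backing-chain', '--output=json', $image],
            outfunc => sub { $json .= "$_[0]\n"; },
        );

        for my $layer (@{ decode_json($json) }) {
            my $backing = $layer->{'full-backing-filename'} or next;
            my $resolved = realpath($backing)
                // die "unable to resolve backing file '$backing'\n";
            die "backing file '$resolved' escapes '$allowed_dir'\n"
                if $resolved !~ m!^\Q$allowed_dir\E/!;
        }
    }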

> +    return {
> +	'qemu-img-path' => '/dev/nbd7',
> +    };
> +}
> +
> +sub restore_vm_volume_cleanup {
> +    my ($self, $volname, $storeid, $devicename, $info) = @_;
> +
> +    PVE::Tools::run_command(['qemu-nbd', '-d', '/dev/nbd7']);
> +
> +    return;
> +}
> +
> +my sub restore_tar_init {
> +    my ($self, $volname, $storeid) = @_;
> +
> +    my (undef, $relative_backup_dir) = $self->{'storage-plugin'}->parse_volname($volname);
> +    return { 'tar-path' => "$self->{scfg}->{path}/${relative_backup_dir}/archive.tar" };
> +}
> +
> +my sub restore_directory_init {
> +    my ($self, $volname, $storeid) = @_;
> +
> +    my (undef, $relative_backup_dir, $vmid) = $self->{'storage-plugin'}->parse_volname($volname);
> +    my $archive = "$self->{scfg}->{path}/${relative_backup_dir}/archive.sqfs";
> +
> +    my $mount_point = "/run/backup-provider-example/${vmid}.mount";
> +    make_path($mount_point);
> +    die "unable to create directory $mount_point\n" if !-d $mount_point;
> +
> +    run_command(['mount', '-o', 'ro', $archive, $mount_point]);
> +
> +    return { 'archive-directory' => $mount_point };
> +}
> +
> +my sub restore_directory_cleanup {
> +    my ($self, $volname, $storeid) = @_;
> +
> +    my (undef, undef, $vmid) = $self->{'storage-plugin'}->parse_volname($volname);
> +    my $mount_point = "/run/backup-provider-example/${vmid}.mount";
> +
> +    run_command(['umount', $mount_point]);
> +
> +    return;
> +}
> +
> +sub restore_container_init {
> +    my ($self, $volname, $storeid, $info) = @_;
> +
> +    if ($self->{'restore-mechanisms'}->{$volname} eq 'tar') {
> +	return restore_tar_init($self, $volname, $storeid);
> +    } elsif ($self->{'restore-mechanisms'}->{$volname} eq 'directory') {
> +	return restore_directory_init($self, $volname, $storeid);
> +    } else {
> +	die "no restore mechanism set for '$volname'\n";
> +    }
> +}
> +
> +sub restore_container_cleanup {
> +    my ($self, $volname, $storeid, $info) = @_;
> +
> +    if ($self->{'restore-mechanisms'}->{$volname} eq 'tar') {
> +	return; # nothing to do
> +    } elsif ($self->{'restore-mechanisms'}->{$volname} eq 'directory') {
> +	return restore_directory_cleanup($self, $volname, $storeid);
> +    } else {
> +	die "no restore mechanism set for '$volname'\n";
> +    }
> +}
> +
> +1;
> diff --git a/src/PVE/BackupProvider/Plugin/Makefile b/src/PVE/BackupProvider/Plugin/Makefile
> index bbd7431..bedc26e 100644
> --- a/src/PVE/BackupProvider/Plugin/Makefile
> +++ b/src/PVE/BackupProvider/Plugin/Makefile
> @@ -1,4 +1,4 @@
> -SOURCES = Base.pm
> +SOURCES = Base.pm DirectoryExample.pm
>  
>  .PHONY: install
>  install:
> diff --git a/src/PVE/Storage/Custom/BackupProviderDirExamplePlugin.pm b/src/PVE/Storage/Custom/BackupProviderDirExamplePlugin.pm
> new file mode 100644
> index 0000000..5152923
> --- /dev/null
> +++ b/src/PVE/Storage/Custom/BackupProviderDirExamplePlugin.pm
> @@ -0,0 +1,307 @@
> +package PVE::Storage::Custom::BackupProviderDirExamplePlugin;
> +
> +use strict;
> +use warnings;
> +
> +use File::Basename qw(basename);
> +
> +use PVE::BackupProvider::Plugin::DirectoryExample;
> +use PVE::Tools;
> +
> +use base qw(PVE::Storage::Plugin);
> +
> +# Helpers
> +
> +sub get_vm_backup_mechanism {
> +    my ($class, $scfg) = @_;
> +
> +    return $scfg->{'vm-backup-mechanism'} // properties()->{'vm-backup-mechanism'}->{'default'};
> +}
> +
> +sub get_vm_backup_mode {
> +    my ($class, $scfg) = @_;
> +
> +    return $scfg->{'vm-backup-mode'} // properties()->{'vm-backup-mode'}->{'default'};
> +}
> +
> +sub get_lxc_backup_mode {
> +    my ($class, $scfg) = @_;
> +
> +    return $scfg->{'lxc-backup-mode'} // properties()->{'lxc-backup-mode'}->{'default'};
> +}
> +
> +# Configuration
> +
> +sub api {
> +    return 11;
> +}
> +
> +sub type {
> +    return 'backup-provider-dir-example';
> +}
> +
> +sub plugindata {
> +    return {
> +	content => [ { backup => 1, none => 1 }, { backup => 1 } ],
> +	features => { 'backup-provider' => 1 },
> +    };
> +}
> +
> +sub properties {
> +    return {
> +	'lxc-backup-mode' => {
> +	    description => "How to create LXC backups. tar - create a tar archive."
> +		." squashfs - create a squashfs image. Requires squashfs-tools to be installed.",
> +	    type => 'string',
> +	    enum => [qw(tar squashfs)],
> +	    default => 'tar',
> +	},
> +	'vm-backup-mechanism' => {
> +	    description => "Which mechanism to use for creating VM backups. nbd - access data via "
> +		." NBD export. block-device - access data via regular block device.",
> +	    type => 'string',
> +	    enum => [qw(nbd block-device)],
> +	    default => 'block-device',
> +	},
> +	'vm-backup-mode' => {
> +	    description => "How to create VM backups. full - always create full backups."
> +		." incremental - create incremental backups when possible, fallback to full when"
> +		." necessary, e.g. VM disk's bitmap is invalid.",
> +	    type => 'string',
> +	    enum => [qw(full incremental)],
> +	    default => 'full',
> +	},
> +    };
> +}
> +
> +sub options {
> +    return {
> +	path => { fixed => 1 },
> +	'lxc-backup-mode' => { optional => 1 },
> +	'vm-backup-mechanism' => { optional => 1 },
> +	'vm-backup-mode' => { optional => 1 },
> +	disable => { optional => 1 },
> +	nodes => { optional => 1 },
> +	'prune-backups' => { optional => 1 },
> +	'max-protected-backups' => { optional => 1 },
> +    };
> +}
> +
> +# Storage implementation
> +
> +# NOTE a proper backup storage should implement this
> +sub prune_backups {
> +    my ($class, $scfg, $storeid, $keep, $vmid, $type, $dryrun, $logfunc) = @_;
> +
> +    die "not implemented";
> +}
> +
> +sub parse_volname {
> +    my ($class, $volname) = @_;
> +
> +    if ($volname =~ m!^backup/((\d+)/[a-z]+-\d+)$!) {
> +	my ($filename, $vmid) = ($1, $2);
> +	return ('backup', $filename, $vmid);
> +    }
> +
> +    die "unable to parse volume name '$volname'\n";
> +}
> +
> +sub path {
> +    my ($class, $scfg, $volname, $storeid, $snapname) = @_;
> +
> +    die "volume snapshot is not possible on backup-provider-dir-example volume" if $snapname;
> +
> +    my ($type, $filename, $vmid) = $class->parse_volname($volname);
> +
> +    return ("$scfg->{path}/${filename}", $vmid, $type);
> +}
> +
> +sub create_base {
> +    my ($class, $storeid, $scfg, $volname) = @_;
> +
> +    die "cannot create base image in backup-provider-dir-example storage\n";
> +}
> +
> +sub clone_image {
> +    my ($class, $scfg, $storeid, $volname, $vmid, $snap) = @_;
> +
> +    die "can't clone images in backup-provider-dir-example storage\n";
> +}
> +
> +sub alloc_image {
> +    my ($class, $storeid, $scfg, $vmid, $fmt, $name, $size) = @_;
> +
> +    die "can't allocate space in backup-provider-dir-example storage\n";
> +}
> +
> +# NOTE a proper backup storage should implement this
> +sub free_image {
> +    my ($class, $storeid, $scfg, $volname, $isBase) = @_;
> +
> +    # if it's a backing file, it would need to be merged into the upper image first.
> +
> +    die "not implemented";
> +}
> +
> +sub list_images {
> +    my ($class, $storeid, $scfg, $vmid, $vollist, $cache) = @_;
> +
> +    my $res = [];
> +
> +    return $res;
> +}
> +
> +sub list_volumes {
> +    my ($class, $storeid, $scfg, $vmid, $content_types) = @_;
> +
> +    my $path = $scfg->{path};
> +
> +    my $res = [];
> +    for my $type ($content_types->@*) {
> +	next if $type ne 'backup';
> +
> +	my @guest_dirs = glob("$path/*");
> +	for my $guest_dir (@guest_dirs) {
> +	    next if !-d $guest_dir || $guest_dir !~ m!/(\d+)$!;
> +
> +	    my $backup_vmid = basename($guest_dir);
> +
> +	    next if defined($vmid) && $backup_vmid != $vmid;
> +
> +	    my @backup_dirs = glob("$guest_dir/*");
> +	    for my $backup_dir (@backup_dirs) {
> +		next if !-d $backup_dir || $backup_dir !~ m!/(lxc|qemu)-(\d+)$!;
> +		my ($subtype, $backup_id) = ($1, $2);
> +
> +		my $size = 0;
> +		my @backup_files = glob("$backup_dir/*");
> +		$size += -s $_ for @backup_files;
> +
> +		push $res->@*, {
> +		    volid => "$storeid:backup/${backup_vmid}/${subtype}-${backup_id}",
> +		    vmid => $backup_vmid,
> +		    format => "directory",
> +		    ctime => $backup_id,
> +		    size => $size,
> +		    subtype => $subtype,
> +		    content => $type,
> +		    # TODO parent for incremental
> +		};
> +	    }
> +	}
> +    }
> +
> +    return $res;
> +}
> +
> +sub activate_storage {
> +    my ($class, $storeid, $scfg, $cache) = @_;
> +
> +    my $path = $scfg->{path};
> +
> +    my $timeout = 2;
> +    if (!PVE::Tools::run_fork_with_timeout($timeout, sub {-d $path})) {
> +	die "unable to activate storage '$storeid' - directory '$path' does not exist or is"
> +	    ." unreachable\n";
> +    }
> +
> +    return 1;
> +}
> +
> +sub deactivate_storage {
> +    my ($class, $storeid, $scfg, $cache) = @_;
> +
> +    return 1;
> +}
> +
> +sub activate_volume {
> +    my ($class, $storeid, $scfg, $volname, $snapname, $cache) = @_;
> +
> +    die "volume snapshot is not possible on backup-provider-dir-example volume" if $snapname;
> +
> +    return 1;
> +}
> +
> +sub deactivate_volume {
> +    my ($class, $storeid, $scfg, $volname, $snapname, $cache) = @_;
> +
> +    die "volume snapshot is not possible on backup-provider-dir-example volume" if $snapname;
> +
> +    return 1;
> +}
> +
> +sub get_volume_attribute {
> +    my ($class, $scfg, $storeid, $volname, $attribute) = @_;
> +
> +    return;
> +}
> +
> +# NOTE a proper backup storage should implement this to support backup notes and
> +# setting protected status.
> +sub update_volume_attribute {
> +    my ($class, $scfg, $storeid, $volname, $attribute, $value) = @_;
> +
> +    die "attribute '$attribute' is not supported on backup-provider-dir-example volume";
> +}
> +
> +sub volume_size_info {
> +    my ($class, $scfg, $storeid, $volname, $timeout) = @_;
> +
> +    my (undef, $relative_backup_dir) = $class->parse_volname($volname);
> +    my ($ctime) = $relative_backup_dir =~ m/-(\d+)$/;
> +    my $backup_dir = "$scfg->{path}/${relative_backup_dir}";
> +
> +    my $size = 0;
> +    my @backup_files = glob("$backup_dir/*");
> +    for my $backup_file (@backup_files) {
> +	if ($backup_file =~ m!\.qcow2$!) {
> +	    $size += $class->file_size_info($backup_file);
> +	} else {
> +	    $size += -s $backup_file;
> +	}
> +    }
> +
> +    my $parent; # TODO for incremental
> +
> +    return wantarray ? ($size, 'directory', $size, $parent, $ctime) : $size;
> +}
> +
> +sub volume_resize {
> +    my ($class, $scfg, $storeid, $volname, $size, $running) = @_;
> +
> +    die "volume resize is not possible on backup-provider-dir-example volume";
> +}
> +
> +sub volume_snapshot {
> +    my ($class, $scfg, $storeid, $volname, $snap) = @_;
> +
> +    die "volume snapshot is not possible on backup-provider-dir-example volume";
> +}
> +
> +sub volume_snapshot_rollback {
> +    my ($class, $scfg, $storeid, $volname, $snap) = @_;
> +
> +    die "volume snapshot rollback is not possible on backup-provider-dir-example volume";
> +}
> +
> +sub volume_snapshot_delete {
> +    my ($class, $scfg, $storeid, $volname, $snap) = @_;
> +
> +    die "volume snapshot delete is not possible on backup-provider-dir-example volume";
> +}
> +
> +sub volume_has_feature {
> +    my ($class, $scfg, $feature, $storeid, $volname, $snapname, $running) = @_;
> +
> +    return 0;
> +}
> +
> +sub new_backup_provider {
> +    my ($class, $scfg, $storeid, $bandwidth_limit, $log_function) = @_;
> +
> +    return PVE::BackupProvider::Plugin::DirectoryExample->new(
> +	$class, $scfg, $storeid, $bandwidth_limit, $log_function);
> +}
> +
> +1;
> diff --git a/src/PVE/Storage/Custom/Makefile b/src/PVE/Storage/Custom/Makefile
> new file mode 100644
> index 0000000..c1e3eca
> --- /dev/null
> +++ b/src/PVE/Storage/Custom/Makefile
> @@ -0,0 +1,5 @@
> +SOURCES = BackupProviderDirExamplePlugin.pm
> +
> +.PHONY: install
> +install:
> +	for i in ${SOURCES}; do install -D -m 0644 $$i ${DESTDIR}${PERLDIR}/PVE/Storage/Custom/$$i; done
> diff --git a/src/PVE/Storage/Makefile b/src/PVE/Storage/Makefile
> index d5cc942..acd37f4 100644
> --- a/src/PVE/Storage/Makefile
> +++ b/src/PVE/Storage/Makefile
> @@ -19,4 +19,5 @@ SOURCES= \
>  .PHONY: install
>  install:
>  	for i in ${SOURCES}; do install -D -m 0644 $$i ${DESTDIR}${PERLDIR}/PVE/Storage/$$i; done
> +	make -C Custom install
>  	make -C LunCmd install
> -- 
> 2.39.5


_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [pve-devel] [POC storage v3 15/34] WIP Borg plugin
  2024-11-07 16:51 ` [pve-devel] [POC storage v3 15/34] WIP Borg plugin Fiona Ebner
@ 2024-11-13 10:52   ` Fabian Grünbichler
  0 siblings, 0 replies; 63+ messages in thread
From: Fabian Grünbichler @ 2024-11-13 10:52 UTC (permalink / raw)
  To: Proxmox VE development discussion

On November 7, 2024 5:51 pm, Fiona Ebner wrote:
> Archive names start with the guest type and ID, followed by the same
> timestamp format as PBS.
> 
> Container archives have the following structure:
> guest.config
> firewall.config
> filesystem/ # containing the whole filesystem structure
> 
> VM archives have the following structure:
> guest.config
> firewall.config
> volumes/ # containing a raw file for each device
> 
> A bind mount (respectively symlinks) is used to achieve this
> structure, because Borg doesn't seem to support renaming on-the-fly.
> (Prefix stripping via the "slashdot hack" would have helped slightly,
> but is only in Borg >= 1.4
> https://github.com/borgbackup/borg/actions/runs/7967940995)
> 
> NOTE: Bandwidth limit is not yet honored and the task size is not
> calculated yet. Discard for VM backups would also be nice to have, but
> it's not entirely clear how (parsing progress and discarding according
> to that is one idea). There is no dirty bitmap support; not sure if
> that is feasible to add.
> 
> Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
> ---
> 
> Changes in v3:
> * make SSH work.
> * adapt to API changes, i.e. config as raw data and user namespace
>   execution context for containers.
> 
>  src/PVE/API2/Storage/Config.pm         |   2 +-
>  src/PVE/BackupProvider/Plugin/Borg.pm  | 439 ++++++++++++++++++
>  src/PVE/BackupProvider/Plugin/Makefile |   2 +-
>  src/PVE/Storage.pm                     |   2 +
>  src/PVE/Storage/BorgBackupPlugin.pm    | 595 +++++++++++++++++++++++++
>  src/PVE/Storage/Makefile               |   1 +
>  6 files changed, 1039 insertions(+), 2 deletions(-)
>  create mode 100644 src/PVE/BackupProvider/Plugin/Borg.pm
>  create mode 100644 src/PVE/Storage/BorgBackupPlugin.pm
> 
> diff --git a/src/PVE/API2/Storage/Config.pm b/src/PVE/API2/Storage/Config.pm
> index e04b6ab..1cbf09d 100755
> --- a/src/PVE/API2/Storage/Config.pm
> +++ b/src/PVE/API2/Storage/Config.pm
> @@ -190,7 +190,7 @@ __PACKAGE__->register_method ({
>  	return &$api_storage_config($cfg, $param->{storage});
>      }});
>  
> -my $sensitive_params = [qw(password encryption-key master-pubkey keyring)];
> +my $sensitive_params = [qw(password encryption-key master-pubkey keyring ssh-key)];
>  
>  __PACKAGE__->register_method ({
>      name => 'create',
> diff --git a/src/PVE/BackupProvider/Plugin/Borg.pm b/src/PVE/BackupProvider/Plugin/Borg.pm
> new file mode 100644
> index 0000000..7bb3ae3
> --- /dev/null
> +++ b/src/PVE/BackupProvider/Plugin/Borg.pm
> @@ -0,0 +1,439 @@
> +package PVE::BackupProvider::Plugin::Borg;
> +
> +use strict;
> +use warnings;
> +
> +use File::chdir;
> +use File::Basename qw(basename);
> +use File::Path qw(make_path remove_tree);
> +use Net::IP;
> +use POSIX qw(strftime);
> +
> +use PVE::Tools;
> +
> +# ($vmtype, $vmid, $time_string)
> +our $ARCHIVE_RE_3 = qr!^pve-(lxc|qemu)-([0-9]+)-([0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}:[0-9]{2}:[0-9]{2}Z)$!;
> +
> +sub archive_name {
> +    my ($vmtype, $vmid, $backup_time) = @_;
> +
> +    return "pve-${vmtype}-${vmid}-" . strftime("%FT%TZ", gmtime($backup_time));
> +}
> +
> +# remove_tree can be very verbose by default, so do explicit error handling and limit to one message
> +my sub _remove_tree {
> +    my ($path) = @_;
> +
> +    remove_tree($path, { error => \my $err });
> +    if ($err && @$err) { # empty array if no error
> +	for my $diag (@$err) {
> +	    my ($file, $message) = %$diag;
> +	    die "cannot remove_tree '$path': $message\n" if $file eq '';
> +	    die "cannot remove_tree '$path': unlinking $file failed - $message\n";
> +	}
> +    }
> +}
> +
> +my sub prepare_run_dir {
> +    my ($archive, $operation) = @_;
> +
> +    my $run_dir = "/run/pve-storage-borg-plugin/${archive}.${operation}";
> +    _remove_tree($run_dir);
> +    make_path($run_dir);
> +    die "unable to create directory $run_dir\n" if !-d $run_dir;

this is used as part of restoring - what if I restore the same archive
in parallel into two different VMIDs?
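
one way out would be to key the directory on more than the archive
name - rough sketch below (the $unique parameter, e.g. the target VMID
or the worker PID, is my assumption, not something the patch provides):

    # hypothetical variant of prepare_run_dir() that cannot clash when
    # the same archive is restored concurrently into different VMIDs
    my sub prepare_run_dir_unique {
        my ($archive, $operation, $unique) = @_;

        $unique //= $$; # fall back to the worker PID
        my $run_dir = "/run/pve-storage-borg-plugin/${archive}.${operation}.${unique}";
        _remove_tree($run_dir);
        make_path($run_dir);
        die "unable to create directory $run_dir\n" if !-d $run_dir;

        return $run_dir;
    }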

> +
> +    return $run_dir;
> +}
> +
> +my sub log_info {
> +    my ($self, $message) = @_;
> +
> +    $self->{'log-function'}->('info', $message);
> +}
> +
> +my sub log_warning {
> +    my ($self, $message) = @_;
> +
> +    $self->{'log-function'}->('warn', $message);
> +}
> +
> +my sub log_error {
> +    my ($self, $message) = @_;
> +
> +    $self->{'log-function'}->('err', $message);
> +}
> +
> +my sub file_contents_from_archive {
> +    my ($self, $archive, $file) = @_;
> +
> +    my $run_dir = prepare_run_dir($archive, "file-contents");
> +
> +    my $raw;
> +
> +    eval {
> +	local $CWD = $run_dir;
> +
> +	$self->{'storage-plugin'}->borg_cmd_extract(
> +	    $self->{scfg},
> +	    $self->{storeid},
> +	    $archive,
> +	    [$file],
> +	);

borg extract has `--stdout`, which would save writing to the FS here
(since this is only used to extract the config file, it should be okay)?
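
untested sketch of what such a helper could look like, mirroring the
existing borg_cmd_* functions in the storage plugin (the helper name is
made up):

    sub borg_cmd_extract_stdout {
        my ($class, $scfg, $storeid, $archive, $path) = @_;

        my $uri = borg_repository_uri($scfg, $storeid);

        local $ENV{BORG_PASSPHRASE} = $class->borg_get_password($scfg, $storeid)
            if !$ENV{BORG_PASSPHRASE};

        my $raw = '';
        my $cmd = ['borg', 'extract', '--stdout', "${uri}::${archive}", $path];
        # outfunc gets the output line-wise without trailing newlines, so
        # re-add them - fine for config files, not for binary content
        PVE::Tools::run_command(
            $cmd, errmsg => "command @$cmd failed", outfunc => sub { $raw .= "$_[0]\n"; });

        return $raw;
    }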

> +
> +	$raw = PVE::Tools::file_get_contents("${run_dir}/${file}");
> +    };
> +    my $err = $@;
> +    eval { _remove_tree($run_dir); };
> +    log_warning($self, $@) if $@;
> +    die $err if $err;
> +
> +    return $raw;
> +}
> +
> +# Plugin implementation
> +
> +sub new {
> +    my ($class, $storage_plugin, $scfg, $storeid, $bandwidth_limit, $log_function) = @_;
> +
> +    my $self = bless {
> +	scfg => $scfg,
> +	storeid => $storeid,
> +	'storage-plugin' => $storage_plugin,
> +	'log-function' => $log_function,
> +    }, $class;
> +
> +    return $self;
> +}
> +
> +sub provider_name {
> +    my ($self) = @_;
> +
> +    return "Borg";
> +}
> +
> +sub job_hook {
> +    my ($self, $phase, $info) = @_;
> +
> +    if ($phase eq 'start') {
> +	$self->{'job-id'} = $info->{'start-time'};
> +	$self->{password} = $self->{'storage-plugin'}->borg_get_password(
> +	    $self->{scfg}, $self->{storeid});
> +	$self->{'ssh-key-fh'} = $self->{'storage-plugin'}->borg_open_ssh_key(
> +	    $self->{scfg}, $self->{storeid});
> +    } else {
> +	delete $self->{password};

why do we delete this, but don't close the ssh-key-fh ?
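
i.e., something like this in the non-start branch (sketch):

    } else {
        delete $self->{password};
        if (my $fh = delete $self->{'ssh-key-fh'}) {
            close($fh) or log_warning($self, "failed to close SSH key file handle - $!");
        }
    }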

> +    }
> +
> +    return;
> +}
> +
> +sub backup_hook {
> +    my ($self, $phase, $vmid, $vmtype, $info) = @_;
> +
> +    if ($phase eq 'start') {
> +	$self->{$vmid}->{'task-size'} = 0;
> +    } elsif ($phase eq 'prepare') {
> +	if ($vmtype eq 'lxc') {
> +	    my $archive = $self->{$vmid}->{archive};
> +	    my $run_dir = prepare_run_dir($archive, "backup-container");
> +	    $self->{$vmid}->{'run-dir'} = $run_dir;
> +
> +	    my $create_dir = sub {
> +		my $dir = shift;
> +		make_path($dir);
> +		die "unable to create directory $dir\n" if !-d $dir;
> +		chown($info->{'backup-user-id'}, -1, $dir)
> +		    or die "unable to change owner for $dir\n";
> +	    };
> +
> +	    $create_dir->("${run_dir}/backup/");
> +	    $create_dir->("${run_dir}/backup/filesystem");
> +	    $create_dir->("${run_dir}/ssh");
> +	    $create_dir->("${run_dir}/.config");
> +	    $create_dir->("${run_dir}/.cache");

so this is a bit tricky.. we need unpriv access (to do the backup), but
we store sensitive things here that we don't actually want to hand out
to everyone..
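
one option would be to create the directories with a restrictive mode
up front, e.g. (sketch, using File::Path's mode option):

    my $create_dir = sub {
        my ($dir, $mode) = @_;
        make_path($dir, { mode => $mode // 0700 });
        die "unable to create directory $dir\n" if !-d $dir;
        chown($info->{'backup-user-id'}, -1, $dir)
            or die "unable to change owner for $dir\n";
    };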

> +
> +	    for my $subdir ($info->{sources}->@*) {
> +		PVE::Tools::run_command([
> +		    'mount',
> +		    '-o', 'bind,ro',
> +		    "$info->{directory}/${subdir}",
> +		    "${run_dir}/backup/filesystem/${subdir}",
> +		]);
> +	    }
> +	}
> +    } elsif ($phase eq 'end' || $phase eq 'abort') {
> +	if ($vmtype eq 'lxc') {
> +	    my $run_dir = $self->{$vmid}->{'run-dir'};
> +	    eval {
> +		eval { PVE::Tools::run_command(['umount', "${run_dir}/ssh"]); };

this might warrant a comment ;) a tmpfs is mounted there in
backup_container..
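
e.g.:

    # backup_container() mounts a tmpfs on ${run_dir}/ssh for the key
    # material, so unmount that first
    eval { PVE::Tools::run_command(['umount', "${run_dir}/ssh"]); };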

> +		eval { PVE::Tools::run_command(['umount', '-R', "${run_dir}/backup/filesystem"]); };
> +		_remove_tree($run_dir);
> +	    };
> +	    die "unable to clean up $run_dir - $@" if $@;
> +	}
> +    }
> +
> +    return;
> +}
> +
> +sub backup_get_mechanism {
> +    my ($self, $vmid, $vmtype) = @_;
> +
> +    return ('block-device', undef) if $vmtype eq 'qemu';
> +    return ('directory', undef) if $vmtype eq 'lxc';
> +
> +    die "unsupported VM type '$vmtype'\n";
> +}
> +
> +sub backup_get_archive_name {
> +    my ($self, $vmid, $vmtype, $backup_time) = @_;
> +
> +    return $self->{$vmid}->{archive} = archive_name($vmtype, $vmid, $backup_time);
> +}
> +
> +sub backup_get_task_size {
> +    my ($self, $vmid) = @_;
> +
> +    return $self->{$vmid}->{'task-size'};
> +}
> +
> +sub backup_handle_log_file {
> +    my ($self, $vmid, $filename) = @_;
> +
> +    return; # don't upload, Proxmox VE keeps the task log too
> +}
> +
> +sub backup_vm {
> +    my ($self, $vmid, $guest_config, $volumes, $info) = @_;
> +
> +    # TODO honor bandwidth limit
> +    # TODO discard?
> +
> +    my $archive = $self->{$vmid}->{archive};
> +
> +    my $run_dir = prepare_run_dir($archive, "backup-vm");
> +    my $volume_dir = "${run_dir}/volumes";
> +    make_path($volume_dir);
> +    die "unable to create directory $volume_dir\n" if !-d $volume_dir;
> +
> +    PVE::Tools::file_set_contents("${run_dir}/guest.config", $guest_config);

same here

> +    my $paths = ['./guest.config'];
> +
> +    if (my $firewall_config = $info->{'firewall-config'}) {
> +	PVE::Tools::file_set_contents("${run_dir}/firewall.config", $firewall_config);

and here - these paths are world-readable by default..
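
file_set_contents() already takes an optional mode (as used for the SSH
key further down), so this could be as small as:

    PVE::Tools::file_set_contents("${run_dir}/guest.config", $guest_config, 0600);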

> +	push $paths->@*, './firewall.config';
> +    }
> +
> +    for my $devicename (sort keys $volumes->%*) {
> +	my $path = $volumes->{$devicename}->{path};
> +	my $link_name = "${volume_dir}/${devicename}.raw";
> +	symlink($path, $link_name) or die "could not create symlink $link_name -> $path\n";
> +	push $paths->@*, "./volumes/" . basename($link_name, ());
> +    }
> +
> +    # TODO --stats for size?
> +
> +    eval {
> +	local $CWD = $run_dir;
> +
> +	$self->{'storage-plugin'}->borg_cmd_create(
> +	    $self->{scfg},
> +	    $self->{storeid},
> +	    $self->{$vmid}->{archive},
> +	    $paths,
> +	    ['--read-special', '--progress'],
> +	);
> +    };
> +    my $err = $@;
> +    eval { _remove_tree($run_dir) };
> +    log_warning($self, $@) if $@;
> +    die $err if $err;
> +}
> +
> +sub backup_container {
> +    my ($self, $vmid, $guest_config, $exclude_patterns, $info) = @_;
> +
> +    # TODO honor bandwidth limit
> +
> +    my $run_dir = $self->{$vmid}->{'run-dir'};
> +    my $backup_dir = "${run_dir}/backup";
> +
> +    my $archive = $self->{$vmid}->{archive};
> +
> +    PVE::Tools::run_command(['mount', '-t', 'tmpfs', '-o', 'size=1M', 'tmpfs', "${run_dir}/ssh"]);
> +
> +    if ($self->{'ssh-key-fh'}) {
> +	my $ssh_key =
> +	    PVE::Tools::safe_read_from($self->{'ssh-key-fh'}, 1024 * 1024, 0, "SSH key file");
> +	PVE::Tools::file_set_contents("${run_dir}/ssh/ssh.key", $ssh_key, 0600);

okay, so this should be fine..

> +    }
> +
> +    if (my $ssh_fingerprint = $self->{scfg}->{'ssh-fingerprint'}) {
> +	my ($server, $port) = $self->{scfg}->@{qw(server port)};
> +	$server = "[$server]" if Net::IP::ip_is_ipv6($server);
> +	$server = "${server}:${port}" if $port;
> +	my $fp_line = "$server $ssh_fingerprint\n";
> +	PVE::Tools::file_set_contents("${run_dir}/ssh/known_hosts", $fp_line, 0600);
> +    }
> +
> +    PVE::Tools::file_set_contents("${backup_dir}/guest.config", $guest_config);

but this

> +    my $paths = ['./guest.config'];
> +
> +    if (my $firewall_config = $info->{'firewall-config'}) {
> +	PVE::Tools::file_set_contents("${backup_dir}/firewall.config", $firewall_config);

and this should also be 0600? or we could chmod the dirs themselves when
creating, to avoid missing paths?

> +	push $paths->@*, './firewall.config';
> +    }
> +
> +    push $paths->@*, "./filesystem";
> +
> +    my $opts = ['--numeric-ids', '--sparse', '--progress'];
> +
> +    for my $pattern ($exclude_patterns->@*) {
> +	if ($pattern =~ m|^/|) {
> +	    push $opts->@*, '-e', "filesystem${pattern}";
> +	} else {
> +	    push $opts->@*, '-e', "filesystem/**${pattern}";
> +	}
> +    }
> +
> +    push $opts->@*, '-e', "filesystem/**lost+found" if $info->{'backup-user-id'} != 0;
> +
> +    # TODO --stats for size?
> +
> +    # Don't make it 'local', to avoid a permission denied error when changing back, because the
> +    # method is executed in a user namespace.
> +    $CWD = $backup_dir if $info->{'backup-user-id'} != 0;
> +    {
> +	local $CWD = $backup_dir;
> +	local $ENV{BORG_BASE_DIR} = ${run_dir};
> +	local $ENV{BORG_PASSPHRASE} = $self->{password};
> +
> +	local $ENV{BORG_RSH} =
> +	    "ssh -o \"UserKnownHostsFile ${run_dir}/ssh/known_hosts\" -i ${run_dir}/ssh/ssh.key";
> +
> +	$self->{'storage-plugin'}->borg_cmd_create(
> +	    $self->{scfg},
> +	    $self->{storeid},
> +	    $self->{$vmid}->{archive},
> +	    $paths,
> +	    $opts,
> +	);
> +    }
> +}
> +
> +sub restore_get_mechanism {
> +    my ($self, $volname, $storeid) = @_;
> +
> +    my (undef, $archive) = $self->{'storage-plugin'}->parse_volname($volname);
> +    my ($vmtype) = $archive =~ m!^pve-([^\s-]+)!
> +	or die "cannot parse guest type from archive name '$archive'\n";
> +
> +    return ('qemu-img', $vmtype) if $vmtype eq 'qemu';
> +    return ('directory', $vmtype) if $vmtype eq 'lxc';
> +
> +    die "unexpected guest type '$vmtype'\n";
> +}
> +
> +sub restore_get_guest_config {
> +    my ($self, $volname, $storeid) = @_;
> +
> +    my (undef, $archive) = $self->{'storage-plugin'}->parse_volname($volname);
> +    return file_contents_from_archive($self, $archive, 'guest.config');
> +}
> +
> +sub restore_get_firewall_config {
> +    my ($self, $volname, $storeid) = @_;
> +
> +    my (undef, $archive) = $self->{'storage-plugin'}->parse_volname($volname);
> +    my $config = eval {
> +	file_contents_from_archive($self, $archive, 'firewall.config');
> +    };
> +    if (my $err = $@) {
> +	return if $err =~ m!Include pattern 'firewall\.config' never matched\.!;
> +	die $err;
> +    }
> +    return $config;
> +}
> +
> +sub restore_vm_init {
> +    my ($self, $volname, $storeid) = @_;
> +
> +    my $res = {};
> +
> +    my (undef, $archive, $vmid) = $self->{'storage-plugin'}->parse_volname($volname);
> +    my $mount_point = prepare_run_dir($archive, "restore-vm");
> +
> +    $self->{'storage-plugin'}->borg_cmd_mount(
> +	$self->{scfg},
> +	$self->{storeid},
> +	$archive,
> +	$mount_point,
> +    );

haven't actually tested this code, but what are the permissions like for
this mounted backup archive contents? we don't want to expose guest
volumes as world-readable either..
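
if the installed borg supports the usual FUSE options for mount (uid,
gid, umask - worth double-checking, this is an assumption on my part),
restricting access could be as simple as:

    my $cmd = [
        'borg', 'mount',
        '-o', 'uid=0,gid=0,umask=0077', # only root can traverse the mount
        "${uri}::${archive}", $mount_point,
    ];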

> +
> +    my @backup_files = glob("$mount_point/volumes/*");
> +    for my $backup_file (@backup_files) {
> +	next if $backup_file !~ m!^(.*/(.*)\.raw)$!; # untaint
> +	($backup_file, my $devicename) = ($1, $2);
> +	# TODO avoid dependency on base plugin?
> +	$res->{$devicename}->{size} = PVE::Storage::Plugin::file_size_info($backup_file);
> +    }
> +
> +    $self->{$volname}->{'mount-point'} = $mount_point;
> +
> +    return $res;
> +}
> +
> +sub restore_vm_cleanup {
> +    my ($self, $volname, $storeid) = @_;
> +
> +    my $mount_point = $self->{$volname}->{'mount-point'} or return;
> +
> +    PVE::Tools::run_command(['umount', $mount_point]);
> +
> +    return;
> +}
> +
> +sub restore_vm_volume_init {
> +    my ($self, $volname, $storeid, $devicename, $info) = @_;
> +
> +    my $mount_point = $self->{$volname}->{'mount-point'}
> +	or die "expected mount point for archive not present\n";
> +
> +    return { 'qemu-img-path' => "${mount_point}/volumes/${devicename}.raw" };
> +}
> +
> +sub restore_vm_volume_cleanup {
> +    my ($self, $volname, $storeid, $devicename, $info) = @_;
> +
> +    return;
> +}
> +
> +sub restore_container_init {
> +    my ($self, $volname, $storeid, $info) = @_;
> +
> +    my (undef, $archive, $vmid) = $self->{'storage-plugin'}->parse_volname($volname);
> +    my $mount_point = prepare_run_dir($archive, "restore-container");
> +
> +    $self->{'storage-plugin'}->borg_cmd_mount(
> +	$self->{scfg},
> +	$self->{storeid},
> +	$archive,
> +	$mount_point,
> +    );

same question here..

> +
> +    $self->{$volname}->{'mount-point'} = $mount_point;
> +
> +    return { 'archive-directory' => "${mount_point}/filesystem" };
> +}
> +
> +sub restore_container_cleanup {
> +    my ($self, $volname, $storeid, $info) = @_;
> +
> +    my $mount_point = $self->{$volname}->{'mount-point'} or return;
> +
> +    PVE::Tools::run_command(['umount', $mount_point]);
> +
> +    return;
> +}
> +
> +1;
> diff --git a/src/PVE/BackupProvider/Plugin/Makefile b/src/PVE/BackupProvider/Plugin/Makefile
> index bedc26e..db08c2d 100644
> --- a/src/PVE/BackupProvider/Plugin/Makefile
> +++ b/src/PVE/BackupProvider/Plugin/Makefile
> @@ -1,4 +1,4 @@
> -SOURCES = Base.pm DirectoryExample.pm
> +SOURCES = Base.pm Borg.pm DirectoryExample.pm
>  
>  .PHONY: install
>  install:
> diff --git a/src/PVE/Storage.pm b/src/PVE/Storage.pm
> index 9f9a86b..f4bfc55 100755
> --- a/src/PVE/Storage.pm
> +++ b/src/PVE/Storage.pm
> @@ -40,6 +40,7 @@ use PVE::Storage::ZFSPlugin;
>  use PVE::Storage::PBSPlugin;
>  use PVE::Storage::BTRFSPlugin;
>  use PVE::Storage::ESXiPlugin;
> +use PVE::Storage::BorgBackupPlugin;
>  
>  # Storage API version. Increment it on changes in storage API interface.
>  use constant APIVER => 11;
> @@ -66,6 +67,7 @@ PVE::Storage::ZFSPlugin->register();
>  PVE::Storage::PBSPlugin->register();
>  PVE::Storage::BTRFSPlugin->register();
>  PVE::Storage::ESXiPlugin->register();
> +PVE::Storage::BorgBackupPlugin->register();
>  
>  # load third-party plugins
>  if ( -d '/usr/share/perl5/PVE/Storage/Custom' ) {
> diff --git a/src/PVE/Storage/BorgBackupPlugin.pm b/src/PVE/Storage/BorgBackupPlugin.pm
> new file mode 100644
> index 0000000..8f0e721
> --- /dev/null
> +++ b/src/PVE/Storage/BorgBackupPlugin.pm
> @@ -0,0 +1,595 @@
> +package PVE::Storage::BorgBackupPlugin;
> +
> +use strict;
> +use warnings;
> +
> +use Fcntl qw(F_GETFD F_SETFD FD_CLOEXEC);
> +use JSON qw(from_json);
> +use MIME::Base64 qw(decode_base64);
> +use Net::IP;
> +use POSIX;
> +
> +use PVE::BackupProvider::Plugin::Borg;
> +use PVE::Tools;
> +
> +use base qw(PVE::Storage::Plugin);
> +
> +my sub borg_repository_uri {
> +    my ($scfg, $storeid) = @_;
> +
> +    my $uri = '';
> +    my $server = $scfg->{server} or die "no server configured for $storeid\n";
> +    my $username = $scfg->{username} or die "no username configured for $storeid\n";
> +    my $prefix = "ssh://$username@";
> +    $server = "[$server]" if Net::IP::ip_is_ipv6($server);
> +    if (my $port = $scfg->{port}) {
> +	$uri = "${prefix}${server}:${port}";
> +    } else {
> +	$uri = "${prefix}${server}";
> +    }
> +    $uri .= $scfg->{'repository-path'};
> +
> +    return $uri;
> +}
> +
> +my sub borg_password_file_name {
> +    my ($scfg, $storeid) = @_;
> +
> +    return "/etc/pve/priv/storage/${storeid}.pw";
> +}
> +
> +my sub borg_set_password {
> +    my ($scfg, $storeid, $password) = @_;
> +
> +    my $pwfile = borg_password_file_name($scfg, $storeid);
> +    mkdir "/etc/pve/priv/storage";
> +
> +    PVE::Tools::file_set_contents($pwfile, "$password\n");
> +}
> +
> +my sub borg_delete_password {
> +    my ($scfg, $storeid) = @_;
> +
> +    my $pwfile = borg_password_file_name($scfg, $storeid);
> +
> +    unlink $pwfile;
> +}
> +
> +sub borg_get_password {
> +    my ($class, $scfg, $storeid) = @_;
> +
> +    my $pwfile = borg_password_file_name($scfg, $storeid);
> +
> +    return PVE::Tools::file_read_firstline($pwfile);
> +}
> +
> +sub borg_cmd_list {
> +    my ($class, $scfg, $storeid) = @_;
> +
> +    my $uri = borg_repository_uri($scfg, $storeid);
> +
> +    local $ENV{BORG_PASSPHRASE} = $class->borg_get_password($scfg, $storeid)
> +	if !$ENV{BORG_PASSPHRASE};
> +
> +    my $json = '';
> +    my $cmd = ['borg', 'list', '--json', $uri];
> +
> +    my $errfunc = sub { warn $_[0]; };
> +    my $outfunc = sub { $json .= $_[0]; };
> +
> +    PVE::Tools::run_command(
> +	$cmd, errmsg => "command @$cmd failed", outfunc => $outfunc, errfunc => $errfunc);
> +
> +    my $res = eval { from_json($json) };
> +    die "unable to parse 'borg list' output - $@\n" if $@;
> +    return $res;
> +}
> +
> +sub borg_cmd_create {
> +    my ($class, $scfg, $storeid, $archive, $paths, $opts) = @_;
> +
> +    my $uri = borg_repository_uri($scfg, $storeid);
> +
> +    local $ENV{BORG_PASSPHRASE} = $class->borg_get_password($scfg, $storeid)
> +	if !$ENV{BORG_PASSPHRASE};
> +
> +    my $cmd = ['borg', 'create', $opts->@*, "${uri}::${archive}", $paths->@*];
> +
> +    PVE::Tools::run_command($cmd, errmsg => "command @$cmd failed");
> +
> +    return;
> +}
> +
> +sub borg_cmd_extract {
> +    my ($class, $scfg, $storeid, $archive, $paths) = @_;
> +
> +    my $uri = borg_repository_uri($scfg, $storeid);
> +
> +    local $ENV{BORG_PASSPHRASE} = $class->borg_get_password($scfg, $storeid)
> +	if !$ENV{BORG_PASSPHRASE};
> +
> +    my $cmd = ['borg', 'extract', "${uri}::${archive}", $paths->@*];
> +
> +    PVE::Tools::run_command($cmd, errmsg => "command @$cmd failed");
> +
> +    return;
> +}
> +
> +sub borg_cmd_delete {
> +    my ($class, $scfg, $storeid, $archive) = @_;
> +
> +    my $uri = borg_repository_uri($scfg, $storeid);
> +
> +    local $ENV{BORG_PASSPHRASE} = $class->borg_get_password($scfg, $storeid)
> +	if !$ENV{BORG_PASSPHRASE};
> +
> +    my $cmd = ['borg', 'delete', "${uri}::${archive}"];
> +
> +    PVE::Tools::run_command($cmd, errmsg => "command @$cmd failed");
> +
> +    return;
> +}
> +
> +sub borg_cmd_info {
> +    my ($class, $scfg, $storeid, $archive, $timeout) = @_;
> +
> +    my $uri = borg_repository_uri($scfg, $storeid);
> +
> +    local $ENV{BORG_PASSPHRASE} = $class->borg_get_password($scfg, $storeid)
> +	if !$ENV{BORG_PASSPHRASE};
> +
> +    my $json = '';
> +    my $cmd = ['borg', 'info', '--json', "${uri}::${archive}"];
> +
> +    my $errfunc = sub { warn $_[0]; };
> +    my $outfunc = sub { $json .= $_[0]; };
> +
> +    PVE::Tools::run_command(
> +	$cmd,
> +	errmsg => "command @$cmd failed",
> +	timeout => $timeout,
> +	outfunc => $outfunc,
> +	errfunc => $errfunc,
> +    );
> +
> +    my $res = eval { from_json($json) };
> +    die "unable to parse 'borg info' output for archive '$archive' - $@\n" if $@;
> +    return $res;
> +}
> +
> +sub borg_cmd_mount {
> +    my ($class, $scfg, $storeid, $archive, $mount_point) = @_;
> +
> +    my $uri = borg_repository_uri($scfg, $storeid);
> +
> +    local $ENV{BORG_PASSPHRASE} = $class->borg_get_password($scfg, $storeid)
> +	if !$ENV{BORG_PASSPHRASE};
> +
> +    my $cmd = ['borg', 'mount', "${uri}::${archive}", $mount_point];
> +
> +    PVE::Tools::run_command($cmd, errmsg => "command @$cmd failed");
> +
> +    return;
> +}
> +
> +my sub parse_backup_time {
> +    my ($time_string) = @_;
> +
> +    my @tm = (POSIX::strptime($time_string, "%FT%TZ"));
> +    # expect sec, min, hour, mday, mon, year
> +    if (grep { !defined($_) } @tm[0..5]) {
> +	warn "error parsing time from string '$time_string'\n";
> +	return 0;
> +    } else {
> +	local $ENV{TZ} = 'UTC'; # time string is UTC
> +
> +	# Fill in isdst to avoid undef warning. No daylight saving time for UTC.
> +	$tm[8] //= 0;
> +
> +	if (my $since_epoch = mktime(@tm)) {
> +	    return int($since_epoch);
> +	} else {
> +	    warn "error parsing time from string '$time_string'\n";
> +	    return 0;
> +	}
> +    }
> +}
> +
> +# Helpers
> +
> +sub type {
> +    return 'borg';
> +}
> +
> +sub plugindata {
> +    return {
> +	content => [ { backup => 1, none => 1 }, { backup => 1 } ],
> +	features => { 'backup-provider' => 1 },
> +    };
> +}
> +
> +sub properties {
> +    return {
> +	'repository-path' => {
> +	    description => "Path to the backup repository",
> +	    type => 'string',
> +	},
> +	'ssh-key' => {
> +	    description => "FIXME", # FIXME
> +	    type => 'string',
> +	},
> +	'ssh-fingerprint' => {
> +	    description => "FIXME", # FIXME
> +	    type => 'string',
> +	},

these should probably get descriptions and formats, but this is titled
WIP :)
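
for the record, something along these lines (wording is just a
suggestion):

    'ssh-key' => {
        description => "Base64-encoded SSH private key used to"
            ." authenticate against the Borg repository host.",
        type => 'string',
    },
    'ssh-fingerprint' => {
        description => "Host key of the Borg repository host, in the"
            ." format used by known_hosts files.",
        type => 'string',
    },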

> +    };
> +}
> +
> +sub options {
> +    return {
> +	'repository-path' => { fixed => 1 },
> +	server => { fixed => 1 },
> +	port => { optional => 1 },
> +	username => { fixed => 1 },
> +	'ssh-key' => { optional => 1 },
> +	'ssh-fingerprint' => { optional => 1 },
> +	password => { optional => 1 },
> +	disable => { optional => 1 },
> +	nodes => { optional => 1 },
> +	'prune-backups' => { optional => 1 },
> +	'max-protected-backups' => { optional => 1 },
> +    };
> +}
> +
> +sub borg_ssh_key_file_name {
> +    my ($scfg, $storeid) = @_;
> +
> +    return "/etc/pve/priv/storage/${storeid}.ssh.key";
> +}
> +
> +sub borg_set_ssh_key {
> +    my ($scfg, $storeid, $key) = @_;
> +
> +    my $pwfile = borg_ssh_key_file_name($scfg, $storeid);

nit: variable name

> +    mkdir "/etc/pve/priv/storage";
> +
> +    PVE::Tools::file_set_contents($pwfile, "$key\n");
> +}
> +
> +sub borg_delete_ssh_key {
> +    my ($scfg, $storeid) = @_;
> +
> +    my $pwfile = borg_ssh_key_file_name($scfg, $storeid);

same

> +
> +    if (!unlink $pwfile) {
> +	return if $! == ENOENT;
> +	die "failed to delete SSH key! $!\n";
> +    }
> +    delete $scfg->{'ssh-key'};
> +}
> +
> +sub borg_get_ssh_key {
> +    my ($scfg, $storeid) = @_;
> +
> +    my $pwfile = borg_ssh_key_file_name($scfg, $storeid);

same

> +
> +    return PVE::Tools::file_get_contents($pwfile);
> +}
> +
> +# Returns a file handle with FD_CLOEXEC disabled if there is an SSH key, or `undef` if there is
> +# not. Dies on error.
> +sub borg_open_ssh_key {
> +    my ($self, $scfg, $storeid) = @_;
> +
> +    my $ssh_key_file = borg_ssh_key_file_name($scfg, $storeid);
> +
> +    my $keyfd;
> +    if (!open($keyfd, '<', $ssh_key_file)) {
> +	if ($! == ENOENT) {
> +	    die "SSH key configured but no key file found!\n" if $scfg->{'ssh-key'};
> +	    return undef;
> +	}
> +	die "failed to open SSH key: $ssh_key_file: $!\n";
> +    }
> +    my $flags = fcntl($keyfd, F_GETFD, 0)
> +	// die "failed to get file descriptor flags for SSH key FD: $!\n";
> +    fcntl($keyfd, F_SETFD, $flags & ~FD_CLOEXEC)
> +	or die "failed to remove FD_CLOEXEC from SSH key file descriptor\n";
> +
> +    return $keyfd;
> +}
> +
> +# Storage implementation
> +
> +sub on_add_hook {
> +    my ($class, $storeid, $scfg, %param) = @_;
> +
> +    if (defined(my $password = $param{password})) {
> +	borg_set_password($scfg, $storeid, $password);
> +    } else {
> +	borg_delete_password($scfg, $storeid);
> +    }
> +
> +    if (defined(my $ssh_key = delete $param{'ssh-key'})) {
> +	my $decoded = decode_base64($ssh_key);
> +	borg_set_ssh_key($scfg, $storeid, $decoded);
> +	$scfg->{'ssh-key'} = 1;
> +    } else {
> +	borg_delete_ssh_key($scfg, $storeid);
> +    }
> +
> +    return;
> +}
> +
> +sub on_update_hook {
> +    my ($class, $storeid, $scfg, %param) = @_;
> +
> +    if (exists($param{password})) {
> +	if (defined($param{password})) {
> +	    borg_set_password($scfg, $storeid, $param{password});
> +	} else {
> +	    borg_delete_password($scfg, $storeid);
> +	}
> +    }
> +
> +    if (exists($param{'ssh-key'})) {
> +	if (defined(my $ssh_key = delete($param{'ssh-key'}))) {
> +	    my $decoded = decode_base64($ssh_key);
> +
> +	    borg_set_ssh_key($scfg, $storeid, $decoded);
> +	    $scfg->{'ssh-key'} = 1;
> +	} else {
> +	    borg_delete_ssh_key($scfg, $storeid);
> +	}
> +    }
> +
> +    return;
> +}
> +
> +sub on_delete_hook {
> +    my ($class, $storeid, $scfg) = @_;
> +
> +    borg_delete_password($scfg, $storeid);
> +    borg_delete_ssh_key($scfg, $storeid);
> +
> +    return;
> +}
> +
> +sub prune_backups {
> +    my ($class, $scfg, $storeid, $keep, $vmid, $type, $dryrun, $logfunc) = @_;
> +
> +    # FIXME - is 'borg prune' compatible with ours?
> +    die "not implemented";
> +}
> +
> +sub parse_volname {
> +    my ($class, $volname) = @_;
> +
> +    if ($volname =~ m!^backup/(.*)$!) {
> +	my $archive = $1;
> +	if ($archive =~ $PVE::BackupProvider::Plugin::Borg::ARCHIVE_RE_3) {
> +	    return ('backup', $archive, $2);
> +	}
> +    }
> +
> +    die "unable to parse Borg volume name '$volname'\n";
> +}
> +
> +sub path {
> +    my ($class, $scfg, $volname, $storeid, $snapname) = @_;
> +
> +    die "volume snapshot is not possible on Borg volume" if $snapname;
> +
> +    my $uri = borg_repository_uri($scfg, $storeid);
> +    my (undef, $archive) = $class->parse_volname($volname);
> +
> +    return "${uri}::${archive}";
> +}
> +
> +sub create_base {
> +    my ($class, $storeid, $scfg, $volname) = @_;
> +
> +    die "cannot create base image in Borg storage\n";
> +}
> +
> +sub clone_image {
> +    my ($class, $scfg, $storeid, $volname, $vmid, $snap) = @_;
> +
> +    die "can't clone images in Borg storage\n";
> +}
> +
> +sub alloc_image {
> +    my ($class, $storeid, $scfg, $vmid, $fmt, $name, $size) = @_;
> +
> +    die "can't allocate space in Borg storage\n";
> +}
> +
> +sub free_image {
> +    my ($class, $storeid, $scfg, $volname, $isBase) = @_;
> +
> +    my (undef, $archive) = $class->parse_volname($volname);
> +
> +    borg_cmd_delete($class, $scfg, $storeid, $archive);
> +
> +    return;
> +}
> +
> +sub list_images {
> +    my ($class, $storeid, $scfg, $vmid, $vollist, $cache) = @_;
> +
> +    return []; # guest images are not supported, only backups
> +}
> +
> +sub list_volumes {
> +    my ($class, $storeid, $scfg, $vmid, $content_types) = @_;
> +
> +    my $res = [];
> +
> +    return $res if !grep { $_ eq 'backup' } $content_types->@*;
> +
> +    my $archives = $class->borg_cmd_list($scfg, $storeid)->{archives}
> +	or die "expected 'archives' key in 'borg list' JSON output missing\n";
> +
> +    for my $info ($archives->@*) {
> +	my $archive = $info->{archive};
> +	my ($vmtype, $backup_vmid, $time_string) =
> +	    $archive =~ $PVE::BackupProvider::Plugin::Borg::ARCHIVE_RE_3 or next;
> +
> +	next if defined($vmid) && $vmid != $backup_vmid;
> +
> +	push $res->@*, {
> +	    volid => "${storeid}:backup/${archive}",
> +	    size => 0, # FIXME how to cheaply get?
> +	    content => 'backup',
> +	    ctime => parse_backup_time($time_string),
> +	    vmid => $backup_vmid,
> +	    format => "borg-archive",
> +	    subtype => $vmtype,
> +	}
> +    }
> +
> +    return $res;
> +}
> +
> +sub status {
> +    my ($class, $storeid, $scfg, $cache) = @_;
> +
> +    my $uri = borg_repository_uri($scfg, $storeid);
> +
> +    my $res;
> +
> +    if ($uri =~ m!^ssh://!) {
> +	#FIXME ssh and df on target?

borg targets will often be locked down to only allow executing borg on
the other end though..

I am not sure what makes sense here tbh..

> +	return;
> +    } else { # $uri is a local path
> +	my $timeout = 2;
> +	$res = PVE::Tools::df($uri, $timeout);
> +
> +	return if !$res || !$res->{total};
> +    }
> +
> +
> +    return ($res->{total}, $res->{avail}, $res->{used}, 1);
> +}
> +
> +sub activate_storage {
> +    my ($class, $storeid, $scfg, $cache) = @_;
> +
> +    # TODO how to cheaply check? split ssh and non-ssh?
> +
> +    return 1;
> +}
> +
> +sub deactivate_storage {
> +    my ($class, $storeid, $scfg, $cache) = @_;
> +
> +    return 1;
> +}
> +
> +sub activate_volume {
> +    my ($class, $storeid, $scfg, $volname, $snapname, $cache) = @_;
> +
> +    die "volume snapshot is not possible on Borg volume" if $snapname;
> +
> +    return 1;
> +}
> +
> +sub deactivate_volume {
> +    my ($class, $storeid, $scfg, $volname, $snapname, $cache) = @_;
> +
> +    die "volume snapshot is not possible on Borg volume" if $snapname;
> +
> +    return 1;
> +}
> +
> +sub get_volume_attribute {
> +    my ($class, $scfg, $storeid, $volname, $attribute) = @_;
> +
> +    return;
> +}
> +
> +sub update_volume_attribute {
> +    my ($class, $scfg, $storeid, $volname, $attribute, $value) = @_;
> +
> +    # FIXME notes or protected possible?
> +
> +    die "attribute '$attribute' is not supported on Borg volume";
> +}
> +
> +sub volume_size_info {
> +    my ($class, $scfg, $storeid, $volname, $timeout) = @_;
> +
> +    my (undef, $archive) = $class->parse_volname($volname);
> +    my (undef, undef, $time_string) =
> +	$archive =~ $PVE::BackupProvider::Plugin::Borg::ARCHIVE_RE_3;
> +
> +    my $backup_time = 0;
> +    if ($time_string) {
> +	$backup_time = parse_backup_time($time_string)
> +    } else {
> +	warn "could not parse time from archive name '$archive'\n";
> +    }
> +
> +    my $archives = borg_cmd_info($class, $scfg, $storeid, $archive, $timeout)->{archives}
> +	or die "expected 'archives' key in 'borg info' JSON output missing\n";
> +
> +    my $stats = eval { $archives->[0]->{stats} }
> +	or die "expected entry in 'borg info' JSON output missing\n";
> +    my ($size, $used) = $stats->@{qw(original_size deduplicated_size)};
> +
> +    ($size) = ($size =~ /^(\d+)$/); # untaint
> +    die "size '$size' not an integer\n" if !defined($size);
> +    # coerce back from string
> +    $size = int($size);
> +    ($used) = ($used =~ /^(\d+)$/); # untaint
> +    die "used '$used' not an integer\n" if !defined($used);
> +    # coerce back from string
> +    $used = int($used);
> +
> +    return wantarray ? ($size, 'borg-archive', $used, undef, $backup_time) : $size;
> +}
> +
> +sub volume_resize {
> +    my ($class, $scfg, $storeid, $volname, $size, $running) = @_;
> +
> +    die "volume resize is not possible on Borg volume";
> +}
> +
> +sub volume_snapshot {
> +    my ($class, $scfg, $storeid, $volname, $snap) = @_;
> +
> +    die "volume snapshot is not possible on Borg volume";
> +}
> +
> +sub volume_snapshot_rollback {
> +    my ($class, $scfg, $storeid, $volname, $snap) = @_;
> +
> +    die "volume snapshot rollback is not possible on Borg volume";
> +}
> +
> +sub volume_snapshot_delete {
> +    my ($class, $scfg, $storeid, $volname, $snap) = @_;
> +
> +    die "volume snapshot delete is not possible on Borg volume";
> +}
> +
> +sub volume_has_feature {
> +    my ($class, $scfg, $feature, $storeid, $volname, $snapname, $running) = @_;
> +
> +    return 0;
> +}
> +
> +sub rename_volume {
> +    my ($class, $scfg, $storeid, $source_volname, $target_vmid, $target_volname) = @_;
> +
> +    die "volume rename is not implemented in Borg storage plugin\n";
> +}
> +
> +sub new_backup_provider {
> +    my ($class, $scfg, $storeid, $bandwidth_limit, $log_function) = @_;
> +
> +    return PVE::BackupProvider::Plugin::Borg->new(
> +	$class, $scfg, $storeid, $bandwidth_limit, $log_function);
> +}
> +
> +1;
> diff --git a/src/PVE/Storage/Makefile b/src/PVE/Storage/Makefile
> index acd37f4..9fe2c66 100644
> --- a/src/PVE/Storage/Makefile
> +++ b/src/PVE/Storage/Makefile
> @@ -14,6 +14,7 @@ SOURCES= \
>  	PBSPlugin.pm \
>  	BTRFSPlugin.pm \
>  	LvmThinPlugin.pm \
> +	BorgBackupPlugin.pm \

do we want this one here, while the other one is in Custom?

>  	ESXiPlugin.pm
>  
>  .PHONY: install
> -- 
> 2.39.5
> 


_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [pve-devel] [RFC common v3 10/34] env: add module with helpers to run a Perl subroutine in a user namespace
  2024-11-13 10:08     ` Fiona Ebner
@ 2024-11-13 11:15       ` Fabian Grünbichler
  0 siblings, 0 replies; 63+ messages in thread
From: Fabian Grünbichler @ 2024-11-13 11:15 UTC (permalink / raw)
  To: Fiona Ebner, Proxmox VE development discussion

On November 13, 2024 11:08 am, Fiona Ebner wrote:
> On 12.11.24 3:20 PM, Fabian Grünbichler wrote:
>> On November 7, 2024 5:51 pm, Fiona Ebner wrote:
>>> +sub __set_id_map($$$) {
>>> +    my ($pid, $what, $value) = @_;
>>> +    sysopen(my $fd, "/proc/$pid/${what}_map", O_WRONLY)
>>> +	or die "failed to open child process' ${what}_map\n";
>>> +    my $rc = syswrite($fd, $value);
>>> +    if (!$rc || $rc != length($value)) {
>>> +	die "failed to set sub$what: $!\n";
>>> +    }
>>> +    close($fd);
>>> +}
>>> +
>>> +sub set_id_map($$) {
>>> +    my ($pid, $id_map) = @_;
>>> +
>>> +    my $gid_map = '';
>>> +    my $uid_map = '';
>>> +
>>> +    for my $map ($id_map->@*) {
>>> +	my ($type, $ct, $host, $length) = $map->@*;
>>> +
>>> +	$gid_map .= "$ct $host $length\n" if $type eq 'g';
>>> +	$uid_map .= "$ct $host $length\n" if $type eq 'u';
>>> +    }
>>> +
>>> +    __set_id_map($pid, 'gid', $gid_map) if $gid_map;
>>> +    __set_id_map($pid, 'uid', $uid_map) if $uid_map;
>>> +}
>> 
>> do we gain a lot here from not just using newuidmap/newgidmap?
>> 
> 
> I didn't know those commands existed :P Running commands seems more
wasteful than just writing a file, but will change if you insist.

they do check against /etc/subuid (or /etc/subgid) and provide nicer
error messages AFAICT.. and this is not really in the hot path, so I am
not sure whether the "overhead" makes much of a difference.

but I am fine with either way :)
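
for reference, a sketch of the newuidmap/newgidmap variant (argument
order per their man pages: <pid> <inside-id> <host-id> <count>,
repeated per range):

    my sub set_id_map_via_tools {
        my ($pid, $id_map) = @_;

        my (@uid_args, @gid_args);
        for my $map ($id_map->@*) {
            my ($type, $ct, $host, $length) = $map->@*;
            push @uid_args, $ct, $host, $length if $type eq 'u';
            push @gid_args, $ct, $host, $length if $type eq 'g';
        }

        # both tools validate the requested ranges against /etc/subuid
        # and /etc/subgid for unprivileged callers
        PVE::Tools::run_command(['newuidmap', $pid, @uid_args]) if @uid_args;
        PVE::Tools::run_command(['newgidmap', $pid, @gid_args]) if @gid_args;
    }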


_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [pve-devel] [RFC qemu v3 06/34] PVE backup: add target ID in backup state
  2024-11-13  9:22     ` Fiona Ebner
  2024-11-13  9:33       ` Fiona Ebner
@ 2024-11-13 11:16       ` Fabian Grünbichler
  2024-11-13 11:40         ` Fiona Ebner
  1 sibling, 1 reply; 63+ messages in thread
From: Fabian Grünbichler @ 2024-11-13 11:16 UTC (permalink / raw)
  To: Fiona Ebner, Proxmox VE development discussion

On November 13, 2024 10:22 am, Fiona Ebner wrote:
> On 12.11.24 5:46 PM, Fabian Grünbichler wrote:
>> On November 7, 2024 5:51 pm, Fiona Ebner wrote:
>>> +    backup_state.target_id = g_strdup("Proxmox");
>> 
>> if we take this opportunity to also support multiple PBS targets while
>> we are at it, it might make sense to make this more of a "legacy" value?
>> or not set it at all here to opt into the legacy behaviour?
>> 
> 
> Why isn't "Proxmox" a good legacy value? When we add support for passing
> in a target ID to qmp_backup(), I had in mind using "PBS-$storeid" or
> something along those lines.

because it might clash with actual target IDs? that's why I thought that
maybe not setting it at all in that case provides more flexibility to
differentiate..


_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [pve-devel] [RFC qemu v3 06/34] PVE backup: add target ID in backup state
  2024-11-13 11:16       ` Fabian Grünbichler
@ 2024-11-13 11:40         ` Fiona Ebner
  2024-11-13 12:03           ` Fabian Grünbichler
  0 siblings, 1 reply; 63+ messages in thread
From: Fiona Ebner @ 2024-11-13 11:40 UTC (permalink / raw)
  To: Fabian Grünbichler, Proxmox VE development discussion

On 13.11.24 12:16 PM, Fabian Grünbichler wrote:
> On November 13, 2024 10:22 am, Fiona Ebner wrote:
>> On 12.11.24 5:46 PM, Fabian Grünbichler wrote:
>>> On November 7, 2024 5:51 pm, Fiona Ebner wrote:
>>>> +    backup_state.target_id = g_strdup("Proxmox");
>>>
>>> if we take this opportunity to also support multiple PBS targets while
>>> we are at it, it might make sense to make this more of a "legacy" value?
>>> or not set it at all here to opt into the legacy behaviour?
>>>
>>
>> Why isn't "Proxmox" a good legacy value? When we add support for passing
>> in a target ID to qmp_backup(), I had in mind using "PBS-$storeid" or
>> something along those lines.
> 
> because it might clash with actual target IDs? that's why I thought that
> maybe not setting it at all in that case provides more flexibility to
> differentiate..

I don't like the special casing that would entail in the C code. Having
it always set regardless of legacy or not is nicer.

How about we fix this on the qemu-server side by passing
"snapshot-access:$storeid" and, in the future, "pbs:$storeid", as
"target-id" values to QMP?


_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [pve-devel] [RFC qemu v3 06/34] PVE backup: add target ID in backup state
  2024-11-13 11:40         ` Fiona Ebner
@ 2024-11-13 12:03           ` Fabian Grünbichler
  0 siblings, 0 replies; 63+ messages in thread
From: Fabian Grünbichler @ 2024-11-13 12:03 UTC (permalink / raw)
  To: Fiona Ebner, Proxmox VE development discussion


> Fiona Ebner <f.ebner@proxmox.com> hat am 13.11.2024 12:40 CET geschrieben:
> 
>  
> On 13.11.24 12:16 PM, Fabian Grünbichler wrote:
> > On November 13, 2024 10:22 am, Fiona Ebner wrote:
> >> On 12.11.24 5:46 PM, Fabian Grünbichler wrote:
> >>> On November 7, 2024 5:51 pm, Fiona Ebner wrote:
> >>>> +    backup_state.target_id = g_strdup("Proxmox");
> >>>
> >>> if we take this opportunity to also support multiple PBS targets while
> >>> we are at it, it might make sense to make this more of a "legacy" value?
> >>> or not set it at all here to opt into the legacy behaviour?
> >>>
> >>
> >> Why isn't "Proxmox" a good legacy value? When we add support for passing
> >> in a target ID to qmp_backup(), I had in mind using "PBS-$storeid" or
> >> something along those lines.
> > 
> > because it might clash with actual target IDs? that's why I thought that
> > maybe not setting it at all in that case provides more flexibility to
> > differentiate..
> 
> I don't like the special casing that would entail in the C code. Having
> it always set regardless of legacy or not is nicer.
> 
> How about we fix this on the qemu-server side by passing
> "snapshot-access:$storeid" and, in the future, "pbs:$storeid", as
> "target-id" values to QMP?

that sounds like an okay approach as well! :)
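
for the archives, a sketch of what the qemu-server call site could look
like (the QMP command and parameter names here are placeholders, not
the ones from this series):

    my $target_id = "snapshot-access:${storeid}"; # later also "pbs:${storeid}"
    mon_cmd($vmid, 'backup-access-setup',
        'target-id' => $target_id,
        # ... remaining backup-access parameters
    );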


_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel

^ permalink raw reply	[flat|nested] 63+ messages in thread

end of thread, other threads:[~2024-11-13 12:04 UTC | newest]

Thread overview: 63+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-11-07 16:51 [pve-devel] [RFC qemu/common/storage/qemu-server/container/manager v3 00/34] backup provider API Fiona Ebner
2024-11-07 16:51 ` [pve-devel] [PATCH qemu v3 01/34] block/reqlist: allow adding overlapping requests Fiona Ebner
2024-11-07 16:51 ` [pve-devel] [PATCH qemu v3 02/34] PVE backup: fixup error handling for fleecing Fiona Ebner
2024-11-07 16:51 ` [pve-devel] [PATCH qemu v3 03/34] PVE backup: factor out setting up snapshot access " Fiona Ebner
2024-11-07 16:51 ` [pve-devel] [PATCH qemu v3 04/34] PVE backup: save device name in device info structure Fiona Ebner
2024-11-07 16:51 ` [pve-devel] [PATCH qemu v3 05/34] PVE backup: include device name in error when setting up snapshot access fails Fiona Ebner
2024-11-07 16:51 ` [pve-devel] [RFC qemu v3 06/34] PVE backup: add target ID in backup state Fiona Ebner
2024-11-12 16:46   ` Fabian Grünbichler
2024-11-13  9:22     ` Fiona Ebner
2024-11-13  9:33       ` Fiona Ebner
2024-11-13 11:16       ` Fabian Grünbichler
2024-11-13 11:40         ` Fiona Ebner
2024-11-13 12:03           ` Fabian Grünbichler
2024-11-07 16:51 ` [pve-devel] [RFC qemu v3 07/34] PVE backup: get device info: allow caller to specify filter for which devices use fleecing Fiona Ebner
2024-11-07 16:51 ` [pve-devel] [RFC qemu v3 08/34] PVE backup: implement backup access setup and teardown API for external providers Fiona Ebner
2024-11-07 16:51 ` [pve-devel] [RFC qemu v3 09/34] PVE backup: implement bitmap support for external backup access Fiona Ebner
2024-11-07 16:51 ` [pve-devel] [RFC common v3 10/34] env: add module with helpers to run a Perl subroutine in a user namespace Fiona Ebner
2024-11-11 18:33   ` Thomas Lamprecht
2024-11-12 10:19     ` Fiona Ebner
2024-11-12 14:20   ` Fabian Grünbichler
2024-11-13 10:08     ` Fiona Ebner
2024-11-13 11:15       ` Fabian Grünbichler
2024-11-07 16:51 ` [pve-devel] [RFC storage v3 11/34] add storage_has_feature() helper function Fiona Ebner
2024-11-07 16:51 ` [pve-devel] [RFC storage v3 12/34] plugin: introduce new_backup_provider() method Fiona Ebner
2024-11-07 16:51 ` [pve-devel] [RFC storage v3 13/34] extract backup config: delegate to backup provider for storages that support it Fiona Ebner
2024-11-07 16:51 ` [pve-devel] [POC storage v3 14/34] add backup provider example Fiona Ebner
2024-11-13 10:52   ` Fabian Grünbichler
2024-11-07 16:51 ` [pve-devel] [POC storage v3 15/34] WIP Borg plugin Fiona Ebner
2024-11-13 10:52   ` Fabian Grünbichler
2024-11-07 16:51 ` [pve-devel] [PATCH qemu-server v3 16/34] move nbd_stop helper to QMPHelpers module Fiona Ebner
2024-11-11 13:55   ` [pve-devel] applied: " Fabian Grünbichler
2024-11-07 16:51 ` [pve-devel] [PATCH qemu-server v3 17/34] backup: move cleanup of fleecing images to cleanup method Fiona Ebner
2024-11-12  9:26   ` [pve-devel] applied: " Fabian Grünbichler
2024-11-07 16:51 ` [pve-devel] [PATCH qemu-server v3 18/34] backup: cleanup: check if VM is running before issuing QMP commands Fiona Ebner
2024-11-12  9:26   ` [pve-devel] applied: " Fabian Grünbichler
2024-11-07 16:51 ` [pve-devel] [PATCH qemu-server v3 19/34] backup: keep track of block-node size for fleecing Fiona Ebner
2024-11-11 14:22   ` Fabian Grünbichler
2024-11-12  9:50     ` Fiona Ebner
2024-11-07 16:51 ` [pve-devel] [RFC qemu-server v3 20/34] backup: allow adding fleecing images also for EFI and TPM Fiona Ebner
2024-11-12  9:26   ` Fabian Grünbichler
2024-11-07 16:51 ` [pve-devel] [RFC qemu-server v3 21/34] backup: implement backup for external providers Fiona Ebner
2024-11-12 12:27   ` Fabian Grünbichler
2024-11-12 14:35     ` Fiona Ebner
2024-11-12 15:17       ` Fabian Grünbichler
2024-11-07 16:51 ` [pve-devel] [PATCH qemu-server v3 22/34] restore: die early when there is no size for a device Fiona Ebner
2024-11-12  9:28   ` [pve-devel] applied: " Fabian Grünbichler
2024-11-07 16:51 ` [pve-devel] [RFC qemu-server v3 23/34] backup: implement restore for external providers Fiona Ebner
2024-11-07 16:51 ` [pve-devel] [RFC qemu-server v3 24/34] backup restore: external: hardening check for untrusted source image Fiona Ebner
2024-11-07 16:51 ` [pve-devel] [PATCH container v3 25/34] create: add missing include of PVE::Storage::Plugin Fiona Ebner
2024-11-12 15:22   ` [pve-devel] applied: " Fabian Grünbichler
2024-11-07 16:51 ` [pve-devel] [RFC container v3 26/34] backup: implement backup for external providers Fiona Ebner
2024-11-07 16:51 ` [pve-devel] [RFC container v3 27/34] create: factor out tar restore command helper Fiona Ebner
2024-11-12 16:28   ` Fabian Grünbichler
2024-11-12 17:08   ` [pve-devel] applied: " Thomas Lamprecht
2024-11-07 16:51 ` [pve-devel] [RFC container v3 28/34] backup: implement restore for external providers Fiona Ebner
2024-11-12 16:27   ` Fabian Grünbichler
2024-11-07 16:51 ` [pve-devel] [RFC container v3 29/34] external restore: don't use 'one-file-system' tar flag when restoring from a directory Fiona Ebner
2024-11-07 16:51 ` [pve-devel] [RFC container v3 30/34] create: factor out compression option helper Fiona Ebner
2024-11-07 16:51 ` [pve-devel] [RFC container v3 31/34] restore tar archive: check potentially untrusted archive Fiona Ebner
2024-11-07 16:51 ` [pve-devel] [RFC container v3 32/34] api: add early check against restoring privileged container from external source Fiona Ebner
2024-11-07 16:51 ` [pve-devel] [PATCH manager v3 33/34] ui: backup: also check for backup subtype to classify archive Fiona Ebner
2024-11-07 16:51 ` [pve-devel] [RFC manager v3 34/34] backup: implement backup for external providers Fiona Ebner
2024-11-12 15:50 ` [pve-devel] partially-applied: [RFC qemu/common/storage/qemu-server/container/manager v3 00/34] backup provider API Thomas Lamprecht
