public inbox for pve-devel@lists.proxmox.com
* Re: [pve-devel] [RFC qemu/storage/qemu-server/container/manager 00/23] backup provider API
@ 2024-07-26 19:47 Jonathan Nicklin via pve-devel
  2024-07-27 15:20 ` Dietmar Maurer
  2024-07-29  8:15 ` Fiona Ebner
  0 siblings, 2 replies; 11+ messages in thread
From: Jonathan Nicklin via pve-devel @ 2024-07-26 19:47 UTC (permalink / raw)
  To: pve-devel; +Cc: Jonathan Nicklin

[-- Attachment #1: Type: message/rfc822, Size: 5922 bytes --]

From: Jonathan Nicklin <jnicklin@blockbridge.com>
To: pve-devel@lists.proxmox.com
Subject: Re: [pve-devel] [RFC qemu/storage/qemu-server/container/manager 00/23]  backup provider API
Date: Fri, 26 Jul 2024 15:47:12 -0400
Message-ID: <6B38485B-5520-4E28-824D-E50C699E96A1@blockbridge.com>

Hi Fiona,

Would adding support for offloading incremental difference detection
to the underlying storage be feasible with the API updates? The QEMU
bitmap strategy works for all storage devices but is far from
optimal. If backup coordinated a storage snapshot, the underlying
storage could enumerate the differences (or generate a bitmap).

This would allow PBS to connect directly to storage and retrieve
incremental differences, which could remove the PVE hosts from the
equation. This "storage-direct" approach for backup would improve
performance, reduce resources, and support incremental backups in all
cases (e.g., power failures, shutdowns, etc.). It would also eliminate
the dependency on QEMU bitmaps and the overhead of fleecing.

Theoretically, this should be possible with any shared storage that
can enumerate incremental differences between snapshots: Ceph,
Blockbridge, iSCSI/ZFS?

Thoughts?





^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [pve-devel] [RFC qemu/storage/qemu-server/container/manager 00/23] backup provider API
  2024-07-26 19:47 [pve-devel] [RFC qemu/storage/qemu-server/container/manager 00/23] backup provider API Jonathan Nicklin via pve-devel
@ 2024-07-27 15:20 ` Dietmar Maurer
  2024-07-27 20:36   ` Jonathan Nicklin via pve-devel
       [not found]   ` <E6295C3B-9E33-47C2-BC0E-9CEC701A2716@blockbridge.com>
  2024-07-29  8:15 ` Fiona Ebner
  1 sibling, 2 replies; 11+ messages in thread
From: Dietmar Maurer @ 2024-07-27 15:20 UTC (permalink / raw)
  To: Proxmox VE development discussion; +Cc: Jonathan Nicklin

> Would adding support for offloading incremental difference detection
> to the underlying storage be feasible with the API updates? The QEMU
> bitmap strategy works for all storage devices but is far from
> optimal.

Sorry, but why do you think this is far from optimal?




^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [pve-devel] [RFC qemu/storage/qemu-server/container/manager 00/23] backup provider API
  2024-07-27 15:20 ` Dietmar Maurer
@ 2024-07-27 20:36   ` Jonathan Nicklin via pve-devel
       [not found]   ` <E6295C3B-9E33-47C2-BC0E-9CEC701A2716@blockbridge.com>
  1 sibling, 0 replies; 11+ messages in thread
From: Jonathan Nicklin via pve-devel @ 2024-07-27 20:36 UTC (permalink / raw)
  To: Dietmar Maurer; +Cc: Jonathan Nicklin, Proxmox VE development discussion

[-- Attachment #1: Type: message/rfc822, Size: 6805 bytes --]

From: Jonathan Nicklin <jnicklin@blockbridge.com>
To: Dietmar Maurer <dietmar@proxmox.com>
Cc: Proxmox VE development discussion <pve-devel@lists.proxmox.com>
Subject: Re: [pve-devel] [RFC qemu/storage/qemu-server/container/manager 00/23] backup provider API
Date: Sat, 27 Jul 2024 16:36:14 -0400
Message-ID: <E6295C3B-9E33-47C2-BC0E-9CEC701A2716@blockbridge.com>


> On Jul 27, 2024, at 11:20 AM, Dietmar Maurer <dietmar@proxmox.com> wrote:
> 
>> Would adding support for offloading incremental difference detection
>> to the underlying storage be feasible with the API updates? The QEMU
>> bitmap strategy works for all storage devices but is far from
>> optimal.
> 
> Sorry, but why do you think this is far from optimal?
> 

The biggest issue we see reported related to QEMU bitmaps is
persistence. The lack of durability results in unpredictable backup
behavior at scale. If a host, rack, or data center loses power, you're
in for a full backup cycle. Even if several VMs are powered off for
some reason, it can be a nuisance. Several storage solutions can
generate the incremental difference bitmaps from durable sources,
eliminating the issue.

That said, using bitmaps or alternative sources for the incremental
differences is slightly orthogonal to the end goal. The real
improvement we're hoping for is the ability to eliminate backup
traffic on the client.

Today, I believe the client is reading the data and pushing it to
PBS. In the case of CEPH, wouldn't this involve sourcing data from
multiple nodes and then sending it to PBS? Wouldn't it be more
efficient for PBS to read it directly from storage? In the case of
centralized storage, we'd like to eliminate the client load
completely, having PBS ingest incremental differences directly from
storage without passing through the client.



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [pve-devel] [RFC qemu/storage/qemu-server/container/manager 00/23] backup provider API
       [not found]   ` <E6295C3B-9E33-47C2-BC0E-9CEC701A2716@blockbridge.com>
@ 2024-07-28  6:46     ` Dietmar Maurer
  2024-07-28 13:54       ` Jonathan Nicklin via pve-devel
       [not found]       ` <1C86CC96-2C9C-466A-A2A9-FC95906C098E@blockbridge.com>
  2024-07-28  7:55     ` Dietmar Maurer
  1 sibling, 2 replies; 11+ messages in thread
From: Dietmar Maurer @ 2024-07-28  6:46 UTC (permalink / raw)
  To: Jonathan Nicklin; +Cc: Proxmox VE development discussion

> Today, I believe the client is reading the data and pushing it to
> PBS. In the case of CEPH, wouldn't this involve sourcing data from
> multiple nodes and then sending it to PBS? Wouldn't it be more
> efficient for PBS to read it directly from storage? In the case of
> centralized storage, we'd like to eliminate the client load
> completely, having PBS ingest incremental differences directly from
> storage without passing through the client.

But Ceph is not centralized storage. Instead, data is distributed among the nodes, so you always need to send some data over the network.
There is no way to "read it directly from storage".




^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [pve-devel] [RFC qemu/storage/qemu-server/container/manager 00/23] backup provider API
       [not found]   ` <E6295C3B-9E33-47C2-BC0E-9CEC701A2716@blockbridge.com>
  2024-07-28  6:46     ` Dietmar Maurer
@ 2024-07-28  7:55     ` Dietmar Maurer
  2024-07-28 14:12       ` Jonathan Nicklin via pve-devel
  1 sibling, 1 reply; 11+ messages in thread
From: Dietmar Maurer @ 2024-07-28  7:55 UTC (permalink / raw)
  To: Jonathan Nicklin; +Cc: Proxmox VE development discussion


> The biggest issue we see reported related to QEMU bitmaps is
> persistence. The lack of durability results in unpredictable backup
> behavior at scale. If a host, rack, or data center loses power, you're
> in for a full backup cycle. Even if several VMs are powered off for
> some reason, it can be a nuisance. Several storage solutions can
> generate the incremental difference bitmaps from durable sources,
> eliminating the issue.

Several storage solutions provide internal snapshots, but none of them has an API to access the dirty bitmap (please correct me if I am wrong). Or what storage solution are you talking about exactly?

Storing the dirty bitmap persistently would be relatively easy, but so far we found no way to make sure the bitmap is always up-to-date. 
We support shared storages, so multiple nodes can access and modify the data without updating the dirty bitmap, which would lead to corrupt backup images...




^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [pve-devel] [RFC qemu/storage/qemu-server/container/manager 00/23] backup provider API
  2024-07-28  6:46     ` Dietmar Maurer
@ 2024-07-28 13:54       ` Jonathan Nicklin via pve-devel
       [not found]       ` <1C86CC96-2C9C-466A-A2A9-FC95906C098E@blockbridge.com>
  1 sibling, 0 replies; 11+ messages in thread
From: Jonathan Nicklin via pve-devel @ 2024-07-28 13:54 UTC (permalink / raw)
  To: Dietmar Maurer; +Cc: Jonathan Nicklin, Proxmox VE development discussion

[-- Attachment #1: Type: message/rfc822, Size: 6617 bytes --]

From: Jonathan Nicklin <jnicklin@blockbridge.com>
To: Dietmar Maurer <dietmar@proxmox.com>
Cc: Proxmox VE development discussion <pve-devel@lists.proxmox.com>
Subject: Re: [pve-devel] [RFC qemu/storage/qemu-server/container/manager 00/23] backup provider API
Date: Sun, 28 Jul 2024 09:54:48 -0400
Message-ID: <1C86CC96-2C9C-466A-A2A9-FC95906C098E@blockbridge.com>

In hyper-converged deployments, the node performing the backup is sourcing ((nodes-1)/nodes)*bytes of backup data (i.e., ingress traffic) and then sending 1*bytes to PBS (i.e., egress traffic). If PBS were to pull the data from the nodes directly, the maximum load on any one host would be (1/nodes)*bytes of egress traffic only... that's a considerable improvement!

Further, nodes that don't host OSDs would be completely quiet. So, in the case of non-converged CEPH, the hypervisor nodes do not need to participate in the backup flow at all.
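
To make that concrete with a purely hypothetical example: in a 4-node
hyper-converged cluster backing up a 1 TiB disk, the node running the
backup today reads roughly (3/4)*1 TiB = ~768 GiB from its peers and
then pushes the full 1 TiB to PBS. If PBS pulled directly from
storage, each node would only serve its own ~256 GiB of egress, and
the hypervisors would move no backup data at all.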

> On Jul 28, 2024, at 2:46 AM, Dietmar Maurer <dietmar@proxmox.com> wrote:
> 
>> Today, I believe the client is reading the data and pushing it to
>> PBS. In the case of CEPH, wouldn't this involve sourcing data from
>> multiple nodes and then sending it to PBS? Wouldn't it be more
>> efficient for PBS to read it directly from storage? In the case of
>> centralized storage, we'd like to eliminate the client load
>> completely, having PBS ingest incremental differences directly from
>> storage without passing through the client.
> 
> But Ceph is not centralized storage. Instead, data is distributed among the nodes, so you always need to send some data over the network.
> There is no way to "read it directly from storage".
> 




^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [pve-devel] [RFC qemu/storage/qemu-server/container/manager 00/23] backup provider API
  2024-07-28  7:55     ` Dietmar Maurer
@ 2024-07-28 14:12       ` Jonathan Nicklin via pve-devel
  0 siblings, 0 replies; 11+ messages in thread
From: Jonathan Nicklin via pve-devel @ 2024-07-28 14:12 UTC (permalink / raw)
  To: Dietmar Maurer; +Cc: Jonathan Nicklin, Proxmox VE development discussion

[-- Attachment #1: Type: message/rfc822, Size: 7279 bytes --]

From: Jonathan Nicklin <jnicklin@blockbridge.com>
To: Dietmar Maurer <dietmar@proxmox.com>
Cc: Proxmox VE development discussion <pve-devel@lists.proxmox.com>
Subject: Re: [pve-devel] [RFC qemu/storage/qemu-server/container/manager 00/23] backup provider API
Date: Sun, 28 Jul 2024 10:12:21 -0400
Message-ID: <CD15F8AE-29C2-441B-8874-62E3FB92C1F8@blockbridge.com>

I am by no means a CEPH expert. However, my understanding is that other backup solutions (in the OpenStack world) have used rbd diff to enable incremental backups. I was hoping that would be relevant here. 

Here's the description of `rbd diff`

<rbd-diff>
Dump a list of byte extents in the image that have changed since the specified start snapshot, or since the image was created. Each output line includes the starting offset (in bytes), the length of the region (in bytes), and either ‘zero’ or ‘data’ to indicate whether the region is known to be zeros or may contain other data.
</rbd-diff>
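
For illustration only (the exact options and output format may differ
between Ceph releases, and the snapshot/image names below are made
up), a diff against a previous backup snapshot might look something
like:

  $ rbd diff --from-snap backup-2024-07-27 rbd/vm-100-disk-0
  Offset     Length    Type
  0          4194304   data
  104857600  8388608   data
  536870912  4194304   zero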

We (Blockbridge) can also enumerate differences between snapshots in the form of extent ranges.

We share the same concerns regarding the consistency of QEMU bitmaps wrt storage. That is why relying on the storage to track differences feels like a more robust solution. 

> On Jul 28, 2024, at 3:55 AM, Dietmar Maurer <dietmar@proxmox.com> wrote:
> 
> 
>> The biggest issue we see reported related to QEMU bitmaps is
>> persistence. The lack of durability results in unpredictable backup
>> behavior at scale. If a host, rack, or data center loses power, you're
>> in for a full backup cycle. Even if several VMs are powered off for
>> some reason, it can be a nuisance. Several storage solutions can
>> generate the incremental difference bitmaps from durable sources,
>> eliminating the issue.
> 
> Several storage solutions provide internal snapshots, but none of them has an API to access the dirty bitmap (please correct me if I am wrong). Or what storage solution are you talking about exactly?
> 
> Storing the dirty bitmap persistently would be relatively easy, but so far we found no way to make sure the bitmap is always up-to-date. 
> We support shared storages, so multiple nodes can access and modify the data without updating the dirty bitmap, which would lead to corrupt backup images...
> 




^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [pve-devel] [RFC qemu/storage/qemu-server/container/manager 00/23] backup provider API
       [not found]       ` <1C86CC96-2C9C-466A-A2A9-FC95906C098E@blockbridge.com>
@ 2024-07-28 14:58         ` Dietmar Maurer
  0 siblings, 0 replies; 11+ messages in thread
From: Dietmar Maurer @ 2024-07-28 14:58 UTC (permalink / raw)
  To: Jonathan Nicklin; +Cc: Proxmox VE development discussion

> In hyper-converged deployments, the node performing the backup is sourcing ((nodes-1)/nodes)*bytes of backup data (i.e., ingress traffic) and then sending 1*bytes to PBS (i.e., egress traffic). If PBS were to pull the data from the nodes directly, the maximum load on any one host would be (1/nodes)*bytes of egress traffic only... that's a considerable improvement!

I guess it would be possible to write a tool like proxmox-backup-client that pulls Ceph data directly from the PBS side. Or extend the backup protocol to allow direct storage access. But this is a considerable amount of development, and it needs much more configuration/setup than the current approach. But patches are always welcome...

Also, it is not clear to me how we could implement a "backup provider API" if we add such optimizations.

And yes, network traffic would be reduced. But IMHO it is easier to add a dedicated network card for the backup server (if the network is the limiting factor). With this setup, the maximum load on the Ceph network is (1/nodes)*bytes of egress traffic only. The backup traffic is on the dedicated backup net.




^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [pve-devel] [RFC qemu/storage/qemu-server/container/manager 00/23] backup provider API
  2024-07-26 19:47 [pve-devel] [RFC qemu/storage/qemu-server/container/manager 00/23] backup provider API Jonathan Nicklin via pve-devel
  2024-07-27 15:20 ` Dietmar Maurer
@ 2024-07-29  8:15 ` Fiona Ebner
  2024-07-29 21:29   ` Jonathan Nicklin via pve-devel
  1 sibling, 1 reply; 11+ messages in thread
From: Fiona Ebner @ 2024-07-29  8:15 UTC (permalink / raw)
  To: Proxmox VE development discussion; +Cc: Jonathan Nicklin

Hi,

On 26.07.24 at 21:47, Jonathan Nicklin via pve-devel wrote:
> 
> Hi Fiona,
> 
> Would adding support for offloading incremental difference detection
> to the underlying storage be feasible with the API updates? The QEMU
> bitmap strategy works for all storage devices but is far from
> optimal. If backup coordinated a storage snapshot, the underlying
> storage could enumerate the differences (or generate a bitmap).
> 
> This would allow PBS to connect directly to storage and retrieve
> incremental differences, which could remove the PVE hosts from the
> equation. This "storage-direct" approach for backup would improve
> performance, reduce resources, and support incremental backups in all
> cases (e.g., power failures, shutdowns, etc.). It would also eliminate
> the dependency on QEMU bitmaps and the overhead of fleecing.
> 
> Theoretically, this should be possible with any shared storage that
> can enumerate incremental differences between snapshots: Ceph,
> Blockbridge, iSCSI/ZFS?
> 
> Thoughts?
> 

The two big advantages of the current mechanism are:

1. it's completely storage-agnostic, so you can even use it with raw
files on a directory storage. It follows in the same spirit as existing
backup. Prohibiting backup for users when they use certain kinds of
storages for VMs is not nice.
2. it's been battle-tested with PBS and works nicely.

I don't see why your suggestion can't be implemented in principle.
Feature requests for (non-incremental) "storage-snapshot" mode backup
have been around for a while. It has not been a priority for
development yet and is totally different from the current "snapshot"
backup mode, so it will need to be developed from the ground up.

That said, AFAICS, it's orthogonal to the series here. When an
implementation like you outlined exists, it can just be added as a new
backup mechanism for external providers (and PBS).

See also the related discussion over at:
https://bugzilla.proxmox.com/show_bug.cgi?id=3233#c19

Best Regards,
Fiona




^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [pve-devel] [RFC qemu/storage/qemu-server/container/manager 00/23] backup provider API
  2024-07-29  8:15 ` Fiona Ebner
@ 2024-07-29 21:29   ` Jonathan Nicklin via pve-devel
  0 siblings, 0 replies; 11+ messages in thread
From: Jonathan Nicklin via pve-devel @ 2024-07-29 21:29 UTC (permalink / raw)
  To: Fiona Ebner; +Cc: Jonathan Nicklin, Proxmox VE development discussion

[-- Attachment #1: Type: message/rfc822, Size: 7821 bytes --]

From: Jonathan Nicklin <jnicklin@blockbridge.com>
To: Fiona Ebner <f.ebner@proxmox.com>
Cc: Proxmox VE development discussion <pve-devel@lists.proxmox.com>
Subject: Re: [pve-devel] [RFC qemu/storage/qemu-server/container/manager 00/23] backup provider API
Date: Mon, 29 Jul 2024 17:29:34 -0400
Message-ID: <5041A106-1A29-459C-AEDC-8531524ACA18@blockbridge.com>

I 100% concur. I am not suggesting any breaking changes; I was just wondering if this work on the API unlocked any new optimizations to make the interactions between the backup client, PBS, and storage more efficient. And also, bbgeek has pinged me to check out the awesome work going on in this space :)

Between your and Dietmar's replies, I see the constraints and potential avenues for improvement. Thanks for your reply!

Respectfully,
-Jonathan

> On Jul 29, 2024, at 4:15 AM, Fiona Ebner <f.ebner@proxmox.com> wrote:
> 
> Hi,
> 
> On 26.07.24 at 21:47, Jonathan Nicklin via pve-devel wrote:
>> 
>> Hi Fiona,
>> 
>> Would adding support for offloading incremental difference detection
>> to the underlying storage be feasible with the API updates? The QEMU
>> bitmap strategy works for all storage devices but is far from
>> optimal. If backup coordinated a storage snapshot, the underlying
>> storage could enumerate the differences (or generate a bitmap).
>> 
>> This would allow PBS to connect directly to storage and retrieve
>> incremental differences, which could remove the PVE hosts from the
>> equation. This "storage-direct" approach for backup would improve
>> performance, reduce resources, and support incremental backups in all
>> cases (e.g., power failures, shutdowns, etc.). It would also eliminate
>> the dependency on QEMU bitmaps and the overhead of fleecing.
>> 
>> Theoretically, this should be possible with any shared storage that
>> can enumerate incremental differences between snapshots: Ceph,
>> Blockbridge, iSCSI/ZFS?
>> 
>> Thoughts?
>> 
> 
> The two big advantages of the current mechanism are:
> 
> 1. it's completely storage-agnostic, so you can even use it with raw
> files on a directory storage. It follows in the same spirit as existing
> backup. Prohibiting backup for users when they use certain kinds of
> storages for VMs is not nice.
> 2. it's been battle-tested with PBS and works nicely.
> 
> I don't see why your suggestion can't be implemented in principle.
> Feature requests for (non-incremental) "storage-snapshot" mode backup
> have been around for a while. It has not been a priority for
> development yet and is totally different from the current "snapshot"
> backup mode, so it will need to be developed from the ground up.
> 
> That said, AFAICS, it's orthogonal to the series here. When an
> implementation like you outlined exists, it can just be added as a new
> backup mechanism for external providers (and PBS).
> 
> See also the related discussion over at:
> https://bugzilla.proxmox.com/show_bug.cgi?id=3233#c19
> 
> Best Regards,
> Fiona
> 




^ permalink raw reply	[flat|nested] 11+ messages in thread

* [pve-devel] [RFC qemu/storage/qemu-server/container/manager 00/23] backup provider API
@ 2024-07-23  9:56 Fiona Ebner
  0 siblings, 0 replies; 11+ messages in thread
From: Fiona Ebner @ 2024-07-23  9:56 UTC (permalink / raw)
  To: pve-devel

======

A backup provider needs to implement a storage plugin as well as a
backup provider plugin. The storage plugin is for integration in
Proxmox VE's front-end, so users can manage the backups via
UI/API/CLI. The backup provider plugin is for interfacing with the
backup provider's backend to integrate backup and restore with that
backend into Proxmox VE.

This is an initial draft of an API and required changes to the backup
stack in Proxmox VE to make it work. Depending on feedback from other
developers and interested parties, it can still substantially change.

======

The backup provider API is split into two parts, both of which again
need different implementations for VM and LXC guests:

1. Backup API

There are hook callbacks for the start/end/abort phases of guest
backups, as well as for the start/end/abort phases of a whole backup
job.

The backup_get_mechanism() method is used to decide on the backup
mechanism. Currently only 'nbd' for VMs and 'directory' for containers
are possible. It also lets the plugin decide whether or not to use a
bitmap for incremental VM backup.

Next, there are methods for backing up guest and firewall
configuration as well as for the backup mechanisms:

- a container filesystem using a provided directory. The directory
  contains an unchanging copy of the container's file system.

- a VM disk using a provided NBD export. The export is an unchanging
  copy of the VM's disk. Either the full image, or in case a bitmap is
  used, the dirty parts of the image since the last time the bitmap
  was used for a successful backup. Reading outside of the dirty parts
  will result in an error. After backing up each part of the disk, it
  should be discarded in the export to avoid unnecessary space usage
  on the Proxmox VE side (there is an associated fleecing image).

Finally, some helpers like getting the provider name or volume ID for
the backup target, as well as for handling the backup log.

2. Restore API

The restore_get_mechanism() method is used to decide on the restore
mechanism. Currently, only 'qemu-img' for VMs and 'directory' and
'tar' for containers are possible.

Next, there are methods for extracting the guest and firewall
configuration, as well as the implementations of the restore
mechanisms. Of course, it is enough to implement one restore mechanism
per guest type:

- for VMs, with the 'qemu-img' mechanism, the backup provider gives a
  path to the disk image that will be restored. The path should be
  something qemu-img can deal with, e.g. it can also be an NBD URI (see
  the illustrative examples after this list).

- for containers, with the 'directory' mechanism, the backup provider
  gives the path to a directory with the full filesystem structure of
  the container.

- for containers, with the 'tar' mechanism, the backup provider gives
  the path to a (potentially compressed) tar archive with the full
  filesystem structure of the container.
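
For example (purely illustrative; the host name and export name below
are made up), the path returned for the 'qemu-img' mechanism could
look like either of:

  /mnt/backup-provider/vm-100-disk-0.raw
  nbd://backup-provider.example.com:10809/vm-100-disk-0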

For VMs, a restore_qemu_get_device_info() helper is also required, to
get the disks included in the backup and their sizes.

See the PVE::BackupProvider::Plugin module for the full API
documentation.
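
To give a rough idea of the overall shape, here is a heavily
abbreviated, hypothetical sketch of a provider plugin. The method
names follow the description above, but the signatures, arguments and
return values are made up for illustration; the actual interface is
defined by the PVE::BackupProvider::Plugin module in this series.

    package PVE::BackupProvider::Custom::SketchExample;

    use strict;
    use warnings;

    use base qw(PVE::BackupProvider::Plugin);

    # NOTE: illustrative pseudo-code only - these signatures are
    # invented and do not match the real plugin API.

    sub provider_name {
        my ($self) = @_;
        return 'sketch-example';
    }

    # Tell the backup stack how this provider wants to receive guest
    # data: 'nbd' for VMs, 'directory' for containers.
    sub backup_get_mechanism {
        my ($self, $vmid, $vmtype) = @_;
        return $vmtype eq 'qemu' ? 'nbd' : 'directory';
    }

    # Tell the restore code how the data comes back: 'qemu-img' for
    # VMs, 'directory' or 'tar' for containers.
    sub restore_get_mechanism {
        my ($self, $volname, $vmtype) = @_;
        return $vmtype eq 'qemu' ? 'qemu-img' : 'tar';
    }

    1;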

======

This series adapts the backup stack in Proxmox VE to allow using the
above API. For QEMU, backup access setup and teardown QMP commands are
implemented to be able to provide access to a consistent disk state to
the backup provider.

The series also provides an example implementation for a backup
provider as a proof-of-concept, exposing the different features.

======

Open questions:

Should the backup provider plugin system also follow the same API
age+version schema with a Custom/ directory for external plugins
derived from the base plugin?

Should there also be hook callbacks (i.e. start/end/abort) for
restore?

Should the bitmap action be passed directly to the backup provider?
I.e. have 'not-used', 'not-used-removed', 'new', 'used', 'invalid',
instead of only 'none', 'new' and 'reuse'. It makes the API slightly
more complicated. Is there any situation where the backup provider
could care whether the bitmap is new because it is the first one, or
new because the previous one was invalid? Both cases require the
backup provider to do a full backup.

======

The patches marked as PATCH rather than RFC can make sense
independently, with QEMU patches 02 and 03 having already been sent
before (they touch the same code, so they are included here):

https://lore.proxmox.com/pve-devel/20240625133551.210636-1-f.ebner@proxmox.com/#r

======

Feedback is very welcome, especially from people wishing to implement
such a backup provider plugin! Please tell me what issues you see
with the proposed API, and what would and wouldn't work from your
perspective.

======

Dependencies: pve-manager, pve-container and qemu-server all depend on
new libpve-storage-perl. pve-manager also build-depends on the new
libpve-storage-perl for its tests. To keep things clean, pve-manager
should also depend on new pve-container and qemu-server.

In qemu-server, there is no version guard added yet, as that depends
on the QEMU version the feature will land in.

======


qemu:

Fiona Ebner (9):
  block/reqlist: allow adding overlapping requests
  PVE backup: fixup error handling for fleecing
  PVE backup: factor out setting up snapshot access for fleecing
  PVE backup: save device name in device info structure
  PVE backup: include device name in error when setting up snapshot
    access fails
  PVE backup: add target ID in backup state
  PVE backup: get device info: allow caller to specify filter for which
    devices use fleecing
  PVE backup: implement backup access setup and teardown API for
    external providers
  PVE backup: implement bitmap support for external backup access

 block/copy-before-write.c |   3 +-
 block/reqlist.c           |   2 -
 pve-backup.c              | 619 +++++++++++++++++++++++++++++++++-----
 pve-backup.h              |  16 +
 qapi/block-core.json      |  58 ++++
 system/runstate.c         |   6 +
 6 files changed, 633 insertions(+), 71 deletions(-)
 create mode 100644 pve-backup.h


storage:

Fiona Ebner (3):
  plugin: introduce new_backup_provider() method
  extract backup config: delegate to backup provider if there is one
  add backup provider example

 src/PVE/BackupProvider/DirectoryExample.pm    | 533 ++++++++++++++++++
 src/PVE/BackupProvider/Makefile               |   6 +
 src/PVE/BackupProvider/Plugin.pm              | 343 +++++++++++
 src/PVE/Makefile                              |   1 +
 src/PVE/Storage.pm                            |  22 +-
 .../Custom/BackupProviderDirExamplePlugin.pm  | 289 ++++++++++
 src/PVE/Storage/Custom/Makefile               |   5 +
 src/PVE/Storage/Makefile                      |   1 +
 src/PVE/Storage/Plugin.pm                     |  15 +
 9 files changed, 1213 insertions(+), 2 deletions(-)
 create mode 100644 src/PVE/BackupProvider/DirectoryExample.pm
 create mode 100644 src/PVE/BackupProvider/Makefile
 create mode 100644 src/PVE/BackupProvider/Plugin.pm
 create mode 100644 src/PVE/Storage/Custom/BackupProviderDirExamplePlugin.pm
 create mode 100644 src/PVE/Storage/Custom/Makefile


qemu-server:

Fiona Ebner (7):
  move nbd_stop helper to QMPHelpers module
  backup: move cleanup of fleecing images to cleanup method
  backup: cleanup: check if VM is running before issuing QMP commands
  backup: allow adding fleecing images also for EFI and TPM
  backup: implement backup for external providers
  restore: die early when there is no size for a device
  backup: implement restore for external providers

 PVE/API2/Qemu.pm             |  33 +++++-
 PVE/CLI/qm.pm                |   3 +-
 PVE/QemuServer.pm            | 139 +++++++++++++++++++++-
 PVE/QemuServer/QMPHelpers.pm |   6 +
 PVE/VZDump/QemuServer.pm     | 218 +++++++++++++++++++++++++++++++----
 5 files changed, 365 insertions(+), 34 deletions(-)


container:

Fiona Ebner (2):
  backup: implement backup for external providers
  backup: implement restore for external providers

 src/PVE/LXC/Create.pm | 143 ++++++++++++++++++++++++++++++++++++++++++
 src/PVE/VZDump/LXC.pm |  20 +++++-
 2 files changed, 162 insertions(+), 1 deletion(-)


manager:

Fiona Ebner (2):
  ui: backup: also check for backup subtype to classify archive
  backup: implement backup for external providers

 PVE/VZDump.pm                      | 43 +++++++++++++++++++++++++-----
 test/vzdump_new_test.pl            |  3 +++
 www/manager6/Utils.js              | 10 ++++---
 www/manager6/grid/BackupView.js    |  4 +--
 www/manager6/storage/BackupView.js |  4 +--
 5 files changed, 50 insertions(+), 14 deletions(-)


Summary over all repositories:
  27 files changed, 2423 insertions(+), 122 deletions(-)

-- 
Generated by git-murpp 0.5.0




^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2024-07-30  6:51 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-07-26 19:47 [pve-devel] [RFC qemu/storage/qemu-server/container/manager 00/23] backup provider API Jonathan Nicklin via pve-devel
2024-07-27 15:20 ` Dietmar Maurer
2024-07-27 20:36   ` Jonathan Nicklin via pve-devel
     [not found]   ` <E6295C3B-9E33-47C2-BC0E-9CEC701A2716@blockbridge.com>
2024-07-28  6:46     ` Dietmar Maurer
2024-07-28 13:54       ` Jonathan Nicklin via pve-devel
     [not found]       ` <1C86CC96-2C9C-466A-A2A9-FC95906C098E@blockbridge.com>
2024-07-28 14:58         ` Dietmar Maurer
2024-07-28  7:55     ` Dietmar Maurer
2024-07-28 14:12       ` Jonathan Nicklin via pve-devel
2024-07-29  8:15 ` Fiona Ebner
2024-07-29 21:29   ` Jonathan Nicklin via pve-devel
  -- strict thread matches above, loose matches on Subject: below --
2024-07-23  9:56 Fiona Ebner
