From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <f.ebner@proxmox.com>
Received: from firstgate.proxmox.com (firstgate.proxmox.com [212.224.123.68])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
 key-exchange X25519 server-signature RSA-PSS (2048 bits))
 (No client certificate requested)
 by lists.proxmox.com (Postfix) with ESMTPS id 3355090DA5
 for <pve-devel@lists.proxmox.com>; Thu, 25 Jan 2024 15:41:55 +0100 (CET)
Received: from firstgate.proxmox.com (localhost [127.0.0.1])
 by firstgate.proxmox.com (Proxmox) with ESMTP id 16C951B950
 for <pve-devel@lists.proxmox.com>; Thu, 25 Jan 2024 15:41:55 +0100 (CET)
Received: from proxmox-new.maurer-it.com (proxmox-new.maurer-it.com
 [94.136.29.106])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
 key-exchange X25519 server-signature RSA-PSS (2048 bits))
 (No client certificate requested)
 by firstgate.proxmox.com (Proxmox) with ESMTPS
 for <pve-devel@lists.proxmox.com>; Thu, 25 Jan 2024 15:41:54 +0100 (CET)
Received: from proxmox-new.maurer-it.com (localhost.localdomain [127.0.0.1])
 by proxmox-new.maurer-it.com (Proxmox) with ESMTP id C728E492BF
 for <pve-devel@lists.proxmox.com>; Thu, 25 Jan 2024 15:41:53 +0100 (CET)
From: Fiona Ebner <f.ebner@proxmox.com>
To: pve-devel@lists.proxmox.com
Date: Thu, 25 Jan 2024 15:41:36 +0100
Message-Id: <20240125144149.216064-1-f.ebner@proxmox.com>
X-Mailer: git-send-email 2.39.2
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Subject: [pve-devel] [RFC qemu/guest-common/manager/qemu-server/docs 00/13]
 fix #4136: implement backup fleecing
X-BeenThere: pve-devel@lists.proxmox.com
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Proxmox VE development discussion <pve-devel.lists.proxmox.com>
List-Unsubscribe: <https://lists.proxmox.com/cgi-bin/mailman/options/pve-devel>, 
 <mailto:pve-devel-request@lists.proxmox.com?subject=unsubscribe>
List-Archive: <http://lists.proxmox.com/pipermail/pve-devel/>
List-Post: <mailto:pve-devel@lists.proxmox.com>
List-Help: <mailto:pve-devel-request@lists.proxmox.com?subject=help>
List-Subscribe: <https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel>, 
 <mailto:pve-devel-request@lists.proxmox.com?subject=subscribe>
X-List-Received-Date: Thu, 25 Jan 2024 14:41:55 -0000

When a backup for a VM is started, QEMU will install a
"copy-before-write" filter in its block layer. This filter ensures
that upon new guest writes, old data still needed for the backup is
sent to the backup target first. The guest write blocks until this
operation is finished so guest IO to not-yet-backed-up sectors will be
limited by the speed of the backup target.
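
As a rough illustration of the behaviour described above, here is a toy
model in Python. It is only a sketch under simplifying assumptions: the
class and method names are made up for this example, a "disk" is a plain
list of clusters, and nothing here mirrors QEMU's actual C code.

```python
class CopyBeforeWriteBackup:
    """Toy model: the disk is a list of clusters, the target a dict."""

    def __init__(self, disk):
        self.disk = disk                       # live guest image
        self.target = {}                       # backup target: index -> data
        self.pending = set(range(len(disk)))   # clusters not yet backed up

    def _copy_to_target(self, idx):
        # In a real setup this is the slow step, e.g. a remote target.
        self.target[idx] = self.disk[idx]
        self.pending.discard(idx)

    def guest_write(self, idx, data):
        # Copy-before-write: the old data has to reach the target
        # before the guest write is allowed to complete.
        if idx in self.pending:
            self._copy_to_target(idx)
        self.disk[idx] = data

    def backup_step(self):
        # Background job: copy one not-yet-backed-up cluster.
        if self.pending:
            self._copy_to_target(min(self.pending))


job = CopyBeforeWriteBackup(["a", "b", "c"])
job.guest_write(1, "B")   # blocks on copying old cluster 1 to the target
while job.pending:
    job.backup_step()
snapshot = [job.target[i] for i in range(3)]
```

The backup ends up with the point-in-time contents, while the guest
write had to wait for `_copy_to_target()`.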

With backup fleecing, such old data is cached in a fleecing image
rather than sent directly to the backup target. This can help guest IO
performance and even prevent hangs in certain scenarios, at the cost
of requiring more storage space.
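
The difference can be sketched with a toy model in Python. This is a
simplified illustration, not the actual implementation: all names are
hypothetical, the "images" are a list and two dicts, and dropping an
entry after copying it merely stands in for discarding data from the
fleecing image.

```python
class FleecingBackup:
    """Toy model: old data is parked in a local fleecing image."""

    def __init__(self, disk):
        self.disk = disk                       # live guest image
        self.fleecing = {}                     # fleecing image: index -> old data
        self.target = {}                       # backup target: index -> data
        self.pending = set(range(len(disk)))   # clusters not yet backed up

    def guest_write(self, idx, data):
        # The old data goes to the (fast, local) fleecing image, so the
        # guest write does not wait for the backup target.
        if idx in self.pending and idx not in self.fleecing:
            self.fleecing[idx] = self.disk[idx]
        self.disk[idx] = data

    def backup_step(self):
        if not self.pending:
            return
        idx = min(self.pending)
        if idx in self.fleecing:
            # Use the preserved old data and drop it afterwards to free
            # space in the fleecing image.
            self.target[idx] = self.fleecing.pop(idx)
        else:
            self.target[idx] = self.disk[idx]
        self.pending.discard(idx)


job = FleecingBackup(["a", "b", "c"])
job.guest_write(1, "B")
job.guest_write(1, "BB")          # old data is only preserved once
fleeced = dict(job.fleecing)      # what the fleecing image holds right now
while job.pending:
    job.backup_step()
```

The extra storage cost is whatever sits in `fleecing` at any given time,
which depends on how much the guest writes while the backup is running.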

With this series, it will be possible to enable backup fleecing via
e.g. `vzdump 123 --fleecing enabled=1,storage=local-zfs`, with fleecing
images created on the storage `local-zfs`. If no storage is specified,
the fleecing image will be created on the same storage as the original
image.
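
To make the option format and the storage fallback concrete, here is a
minimal Python sketch. It is a stand-in for illustration only: the real
parsing is done in Perl via the schema added in pve-guest-common, and
the helper names `parse_fleecing` and `fleecing_storage` are
hypothetical.

```python
def parse_fleecing(prop_string):
    """Parse a string like 'enabled=1,storage=local-zfs' into a dict."""
    opts = {}
    for part in prop_string.split(","):
        key, sep, value = part.partition("=")
        if not sep:
            raise ValueError(f"expected key=value, got {part!r}")
        opts[key.strip()] = value.strip()
    return {
        "enabled": opts.get("enabled", "0") == "1",
        "storage": opts.get("storage"),   # None if not specified
    }


def fleecing_storage(fleecing, source_storage):
    # Without an explicit storage, fall back to the storage of the
    # original image.
    return fleecing["storage"] or source_storage


explicit = parse_fleecing("enabled=1,storage=local-zfs")
implicit = parse_fleecing("enabled=1")
```

Here `fleecing_storage(implicit, "local-lvm")` would pick `local-lvm`,
i.e. the storage of the image being backed up.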


Fleecing images are created by qemu-server via pve-storage and
attached to QEMU before the backup starts, and cleaned up after the
backup finishes or fails. Currently, just a "-fleecing(.raw)" suffix
is added and there is no special handling yet for e.g. qm rescan.
Previous leftovers are not automatically cleaned up, because, while
unlikely, images with this name might have been created by a user
too. Happy to discuss alternatives!

The fleecing image needs to be the exact same size as the source, but
luckily, an explicit size can be specified when attaching a raw image
to QEMU, so there are no size issues with storages that allocate at a
coarser granularity and round the size up.
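
A small worked example of the size mismatch, as a Python sketch. The
numbers and the 1 MiB granularity are made up for illustration, and
`allocated_size` is a hypothetical helper, not a pve-storage function.

```python
MiB = 1024 * 1024


def allocated_size(requested, granularity):
    """Round a requested size up to the storage's allocation granularity."""
    return -(-requested // granularity) * granularity


source_size = 1000 * MiB + 4096             # size of the image being backed up
on_storage = allocated_size(source_size, MiB)

# The volume on the storage is larger than the source, but the size
# given explicitly when attaching the raw image is the source size.
attached_size = source_size
```

In this example the storage allocates 1001 MiB, while QEMU would be
told to use exactly `source_size` bytes of it.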


While initial tests seem fine, bitmap handling needs to be carefully
checked for correctness. More eyeballs can't hurt there.

QEMU patches are for the submodule for better reviewability. There are
unfortunately a few prerequisites which are also still being worked on
upstream. These are:

Fix for qcow2 block status querying when used as a source image [0].
Already reviewed and being pulled.

Addition of a discard-source parameter [1], which is needed to be able
to discard the fleecing image. This series was adapted for downstream
and I tried to address the two remaining issues:

1. Permission issue when backup source node is read-only (e.g. TPM
state): Made permissions conditional for when discard-source is set
with a new option for the copy-before-write block driver. Currently,
it's part of QAPI; it would be nicer to make it internal-only.

2. Cluster size issue when fleecing image has a larger cluster size
than backup target: Made a workaround by also considering source image
when calculating cluster size for block copy and had to hack
.bdrv_co_get_info implementations for snapshot-access and
copy-before-write. I'm not super confident about this, so it is better
to wait for an answer from upstream.

Upstream reports/discussions for these can also be found at [1].
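
One way to read the cluster size workaround from point 2, sketched in
Python. This is an interpretation for illustration, not the patch
itself: the function names are hypothetical and the real logic lives in
QEMU's C block-copy code.

```python
KiB = 1024
MiB = 1024 * KiB


def block_copy_cluster_size(target_cluster, source_cluster):
    # Use the larger of the two, so that a copied (and afterwards
    # discarded) range never covers only part of a source cluster.
    return max(target_cluster, source_cluster)


def align_range(offset, length, cluster):
    """Expand [offset, offset + length) to cluster boundaries."""
    start = offset // cluster * cluster
    end = -(-(offset + length) // cluster) * cluster
    return start, end - start


# Hypothetical sizes: backup target with 64 KiB clusters, fleecing
# image (the source of the backup job) with 1 MiB clusters.
cluster = block_copy_cluster_size(64 * KiB, 1 * MiB)
small = align_range(70_000, 10, 64 * KiB)
large = align_range(70_000, 10, cluster)
```

With only the target's 64 KiB considered, a request would be aligned to
a range smaller than one fleecing cluster; with the maximum, it covers
the whole 1 MiB cluster.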


No hard dependencies AFAICS, but of course pve-manager should depend
on both new pve-guest-common and qemu-server to actually be able to
use the option.


[0]: https://lore.kernel.org/qemu-devel/20240116154839.401030-1-f.ebner@proxmox.com/
[1]: https://lore.kernel.org/qemu-devel/20240117160737.1057513-1-vsementsov@yandex-team.ru/

qemu:

Fiona Ebner (6):
  backup: factor out gathering device info into helper
  backup: get device info: code cleanup
  block/io: clear BDRV_BLOCK_RECURSE flag after recursing in
    bdrv_co_block_status
  block/{copy-before-write,snapshot-access}: implement bdrv_co_get_info
    driver callback
  block/block-copy: always consider source cluster size too
  PVE backup: add fleecing option

Vladimir Sementsov-Ogievskiy (2):
  block/copy-before-write: create block_copy bitmap in filter node
  qapi: blockdev-backup: add discard-source parameter

 block/backup.c                         |  15 +-
 block/block-copy.c                     |  36 ++--
 block/copy-before-write.c              |  46 ++++-
 block/copy-before-write.h              |   1 +
 block/io.c                             |  10 ++
 block/monitor/block-hmp-cmds.c         |   1 +
 block/replication.c                    |   4 +-
 block/snapshot-access.c                |   7 +
 blockdev.c                             |   2 +-
 include/block/block-copy.h             |   3 +-
 include/block/block_int-global-state.h |   2 +-
 pve-backup.c                           | 234 +++++++++++++++++++------
 qapi/block-core.json                   |  18 +-
 13 files changed, 300 insertions(+), 79 deletions(-)


guest-common:

Fiona Ebner (1):
  vzdump: schema: add fleecing property string

 src/PVE/VZDump/Common.pm | 25 +++++++++++++++++++++++++
 1 file changed, 25 insertions(+)


manager:

Fiona Ebner (1):
  vzdump: handle new 'fleecing' property string

 PVE/VZDump.pm | 12 ++++++++++++
 1 file changed, 12 insertions(+)


qemu-server:

Fiona Ebner (2):
  backup: disk info: also keep track of size
  backup: implement fleecing option

 PVE/VZDump/QemuServer.pm | 141 ++++++++++++++++++++++++++++++++++++++-
 1 file changed, 139 insertions(+), 2 deletions(-)


docs:

Fiona Ebner (1):
  vzdump: add section about backup fleecing

 vzdump.adoc | 28 ++++++++++++++++++++++++++++
 1 file changed, 28 insertions(+)


Summary over all repositories:
  17 files changed, 504 insertions(+), 81 deletions(-)

-- 
Generated by git-murpp 0.5.0