From: "DERUMIER, Alexandre via pve-devel" <pve-devel@lists.proxmox.com>
To: "pve-devel@lists.proxmox.com" <pve-devel@lists.proxmox.com>,
"f.gruenbichler@proxmox.com" <f.gruenbichler@proxmox.com>
Cc: "DERUMIER, Alexandre" <alexandre.derumier@groupe-cyllene.com>
Subject: Re: [pve-devel] [PATCH v3 qemu-server 11/11] qcow2: add external snapshot support
Date: Mon, 13 Jan 2025 10:08:27 +0000 [thread overview]
Message-ID: <mailman.242.1736762947.441.pve-devel@lists.proxmox.com> (raw)
In-Reply-To: <570013533.830.1736423850641@webmail.proxmox.com>
[-- Attachment #1: Type: message/rfc822, Size: 24364 bytes --]
From: "DERUMIER, Alexandre" <alexandre.derumier@groupe-cyllene.com>
To: "pve-devel@lists.proxmox.com" <pve-devel@lists.proxmox.com>, "f.gruenbichler@proxmox.com" <f.gruenbichler@proxmox.com>
Subject: Re: [pve-devel] [PATCH v3 qemu-server 11/11] qcow2: add external snapshot support
Date: Mon, 13 Jan 2025 10:08:27 +0000
Message-ID: <0ae72889042e006d9202e837aac7ecf2b413e1b4.camel@groupe-cyllene.com>
> Alexandre Derumier via pve-devel <pve-devel@lists.proxmox.com> hat am
> 16.12.2024 10:12 CET geschrieben:
>>it would be great if there'd be a summary of the design choices and a
>>high level summary of what happens to the files and block-node-graph
>>here. it's a bit hard to judge from the code below whether it would
>>be possible to eliminate the dynamically named block nodes, for
>>example ;)
yes, sorry, I'll add more infos with qemu limitations and why I'm doing
it like this.
>>a few more comments documenting the behaviour and ideally also some
>>tests (mocking the QMP interactions?) would be nice
yes, I'll add tests, need to find a good way to mock it.
> +
> + #preallocate add a new current file
> + my $new_current_fmt_nodename = get_blockdev_nextid("fmt-
> $deviceid", $nodes);
> + my $new_current_file_nodename = get_blockdev_nextid("file-
> $deviceid", $nodes);
>>okay, so here we have a dynamic node name because the desired target
>>name is still occupied. could we rename the old block node first?
we can't rename a node, that's the problem.
> + PVE::Storage::volume_snapshot($storecfg, $volid, $snap);
>>(continued from above) and this invocation here are the same??
The invocation is the same, but they it's not doing the same if it's an
external snasphot.
>> wouldn't this already create the snapshot on the storage layer?
yes, it's create the (lvm volume) + qcow2 file with preallocation
>>and didn't we just hardlink + reopen + unlink to transform the
>>previous current volume into the snap volume?
yes.
here, we are creating the new current volume, adding it to the graph
with blockdev-add, then finally switch to it with blockdev-snapshot
The ugly trick in pve-storage is in plugin.pm
#rename current volume to snap volume
rename($path, $snappath) if -e $path && !-e $snappath;
or in lvm plugin.
eval { lvrename($vg, $volname, $snap_volname) } ;
(and you have already made comment about them ;)
because I'm re-using volume_snapshot (I didn't have to add a new method
in pve-storage) to allocate the snasphot file, but indeed, we have
already to the rename online.
>>should this maybe have been vdisk_alloc and it just works by
accident?
It's not works fine with vdisk_alloc, because the volume need to be
created without the size specified but with backing file param instead.
(if I remember, qemu-img is looking at the backing file size+metadas
and set the correct total size for the new volume)
Maybe a better way could be to reuse vdisk_alloc, and add backing file
as param ?
> + my $new_file_blockdev = generate_file_blockdev($storecfg,
> $drive, $new_current_file_nodename);
> + my $new_fmt_blockdev = generate_format_blockdev($storecfg,
> $drive, $new_current_fmt_nodename, $new_file_blockdev);
> +
> + $new_fmt_blockdev->{backing} = undef;
>>generate_format_blockdev doesn't set backing?
yes, it's adding backing
>>maybe this should be >>converted into an assertion?
but they are a limitation of the qmp blockdev-ad ++blockdev-snapshot
where the backing attribute need undef in the blockdev-add or the
blockdev-snapshot will fail because it's trying itself to set the
backing file when doing the switch.
From my test, it was related to this
https://lists.gnu.org/archive/html/qemu-block/2019-10/msg01404.html
> + PVE::QemuServer::Monitor::mon_cmd($vmid, 'blockdev-add',
> %$new_fmt_blockdev);
> + mon_cmd($vmid, 'blockdev-snapshot', node => $format_nodename,
> overlay => $new_current_fmt_nodename);
> +}
> +
> +sub blockdev_snap_rename {
> + my ($storecfg, $vmid, $deviceid, $drive, $src_path,
> $target_path) = @_;
>>I think this whole thing needs more error handling and thought about
>>how to recover from various points failing..
yes, that's the problem with renaming, it's not atomic.
Also, if we need to recover (rollback), how to manage multiple disk ?
>>there's also quite some overlap with blockdev_current_rename, I
>>wonder whether it would be possible to simplify the code further by
>merging the two? but see below, I think we can even get away with
>>dropping this altogether if we switch from block-commit to block-
>>stream..
Yes, I have seperated them because I was not sure of the different
workflow, and It was more simplier to fix one method without breaking
the other.
I'll look to implement block-stream. (and keep commit to initial image
for the last snapshot delete)
> + #untaint
> + if ($src_path =~ m/^(\S+)$/) {
> + $src_path = $1;
> + }
> + if ($target_path =~ m/^(\S+)$/) {
> + $target_path = $1;
> + }
>>shouldn't that have happened in the storage plugin?
>>
> +
> + #create a hardlink
> + link($src_path, $target_path);
>>should this maybe be done by the storage plugin?
This was to avoid to introduce a sub method, but yes, it could be
better indeed.
PVE::Storage::link ?
>
> + #delete old $path link
> + unlink($src_path);
and this
PVE::Storage::unlink ?
(can't use free_image here, because we really want to remove the link
and not the volume )
> +
> + #rename underlay
> + my $storage_name = PVE::Storage::parse_volume_id($volid);
> + my $scfg = $storecfg->{ids}->{$storage_name};
> + if ($scfg->{type} eq 'lvm') {
> + print"lvrename $src_path to $target_path\n";
> + run_command(
> + ['/sbin/lvrename', $src_path, $target_path],
> + errmsg => "lvrename $src_path to $target_path error",
> + );
> + }
>>and this as well?
I didn't reuse lvrename in lvmplugin, because it's using vgname/lvname
and not the path, but I can look to extend it)
> +}
> +
> +sub blockdev_current_rename {
> + my ($storecfg, $vmid, $deviceid, $drive, $path, $target_path,
> $skip_underlay) = @_;
> + ## rename current running image
> +
> + my $nodes = get_blockdev_nodes($vmid);
> + my $volid = $drive->{file};
> + my $target_file_nodename = get_blockdev_nextid("file-$deviceid",
> $nodes);
>>here we could already incorporate the snapshot name, since we know
it?
31char limits.
> +
> + my $file_blockdev = generate_file_blockdev($storecfg, $drive,
> $target_file_nodename);
> + $file_blockdev->{filename} = $target_path;
> +
> + my $format_node = find_blockdev_node($nodes, $path, 'fmt');
>>then we'd know this is always the "current" node, however we
>>deterministically name it?
until you are doing a block-mirror, the current fmt node will be
replaced with another current2 fmt node.
>>and this should be done by the storage layer I think? how does this
>>interact with LVM?
from my test, an hardlink is working
>> would we maybe want to mknod instead of hardlinking the
device node?
because /dev/<vgname>/<lv> is not a device node, it's already a link
to the device node
for example:
lrwxrwxrwx 1 root root 7 Dec 10 00:11 vm-10001-disk-0 -> ../dm-9
#ln vm-10001-disk-0 testrename
lrwxrwxrwx 1 root root 7 Dec 10 00:11 vm-10001-disk-0 -> ../dm-9
lrwxrwxrwx 1 root root 7 Dec 10 00:11 testrename -> ../dm-9
>>did you try whether a plain rename would also work (not sure - qemu
>>already has an open FD to the file/blockdev, but I am not sure how
>>LVM handles this ;))?
from my test, the lvrename, it simply the rename the lvm volume
internaly, then rename link.. and as we have already create the link,
it's simply rename it without problem.
#lvrename vm-10001-disk-0 vm-10001-disk-snap1
lrwxrwxrwx 1 root root 7 Dec 10 00:11 vm-10001-disk-snap1 -> ../dm-
9
lrwxrwxrwx 1 root root 7 Dec 10 00:11 testrename -> ../dm-9
#lvrename vm-10001-disk-snap1 testrename
lrwxrwxrwx 1 root root 7 Dec 10 00:11 testrename -> ../dm-9
>
> +
> +sub blockdev_commit {
>>see comments below for qemu_volume_snapshot_delete, I think this..
>>and this can be replaced altogether with blockdev_stream..
>>wouldn't it make more sense to use block-stream to merge the contents
>>of the to-be-deleted snapshot into the current overlay? that way we
>>wouldn't need to rename anything, AFAICT..
>>same here, instead of commiting from the child into the to-be-deleted
>>snapshot, and then renaming, why not just block-stream from the to-
>>be-deleted snapshot into the child, and then discard the snapshot
>>that is no longer needed?
>>commit is the wrong direction though?
>>
>>if we have A -> B -> C, and B is deleted, the delta previously
co>>ntained in B should be merged into C, not into A?
>>
>>so IMHO a simple block-stream + removal of the to-be-deleted snapshot
>>should be the right choice here as well?
>>
>>that would effectively make all the paths identical AFAICT (stream
>>from to-be-deleted snapshot to child, followed by deletion of the no
>>longer used volume corresponding to the deleted/streamed snapshot)
>>and no longer require any renaming..
Yes, got it now. I'll implement block-stream.
But keep commit for last snapshot delete.
[-- Attachment #2: Type: text/plain, Size: 160 bytes --]
_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
next prev parent reply other threads:[~2025-01-13 10:09 UTC|newest]
Thread overview: 68+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <20241216091229.3142660-1-alexandre.derumier@groupe-cyllene.com>
2024-12-16 9:12 ` [pve-devel] [PATCH v1 pve-qemu 1/1] add block-commit-replaces option patch Alexandre Derumier via pve-devel
2025-01-08 13:27 ` Fabian Grünbichler
2025-01-10 7:55 ` DERUMIER, Alexandre via pve-devel
[not found] ` <34a164520eba035d1db5f70761b0f4aa59fecfa5.camel@groupe-cyllene.com>
2025-01-10 9:15 ` Fiona Ebner
2025-01-10 9:32 ` DERUMIER, Alexandre via pve-devel
[not found] ` <1e45e756801843dd46eb6ce2958d30885ad73bc2.camel@groupe-cyllene.com>
2025-01-13 14:28 ` Fiona Ebner
2025-01-14 10:10 ` DERUMIER, Alexandre via pve-devel
2024-12-16 9:12 ` [pve-devel] [PATCH v3 qemu-server 01/11] blockdev: cmdline: convert drive to blockdev syntax Alexandre Derumier via pve-devel
2025-01-08 14:17 ` Fabian Grünbichler
2025-01-10 13:50 ` DERUMIER, Alexandre via pve-devel
2024-12-16 9:12 ` [pve-devel] [PATCH v3 pve-storage 1/3] qcow2: add external snapshot support Alexandre Derumier via pve-devel
2025-01-09 12:36 ` Fabian Grünbichler
2025-01-10 9:10 ` DERUMIER, Alexandre via pve-devel
[not found] ` <f25028d41a9588e82889b3ef869a14f33cbd216e.camel@groupe-cyllene.com>
2025-01-10 11:02 ` Fabian Grünbichler
2025-01-10 11:51 ` DERUMIER, Alexandre via pve-devel
[not found] ` <1caecaa23e5d57030a9e31f2f0e67648f1930d6a.camel@groupe-cyllene.com>
2025-01-10 12:20 ` Fabian Grünbichler
2025-01-10 13:14 ` DERUMIER, Alexandre via pve-devel
2024-12-16 9:12 ` [pve-devel] [PATCH v3 qemu-server 02/11] blockdev: fix cfg2cmd tests Alexandre Derumier via pve-devel
2024-12-16 9:12 ` [pve-devel] [PATCH v3 pve-storage 2/3] lvmplugin: add qcow2 snapshot Alexandre Derumier via pve-devel
2025-01-09 13:55 ` Fabian Grünbichler
2025-01-10 10:16 ` DERUMIER, Alexandre via pve-devel
2024-12-16 9:12 ` [pve-devel] [PATCH v3 qemu-server 03/11] blockdev : convert qemu_driveadd && qemu_drivedel Alexandre Derumier via pve-devel
2025-01-08 14:26 ` Fabian Grünbichler
2025-01-10 14:08 ` DERUMIER, Alexandre via pve-devel
2024-12-16 9:12 ` [pve-devel] [PATCH v3 pve-storage 3/3] storage: vdisk_free: remove external snapshots Alexandre Derumier via pve-devel
2024-12-16 9:12 ` [pve-devel] [PATCH v3 qemu-server 04/11] blockdev: vm_devices_list : fix block-query Alexandre Derumier via pve-devel
2025-01-08 14:31 ` Fabian Grünbichler
2025-01-13 7:56 ` DERUMIER, Alexandre via pve-devel
2024-12-16 9:12 ` [pve-devel] [PATCH v3 qemu-server 05/11] blockdev: convert cdrom media eject/insert Alexandre Derumier via pve-devel
2025-01-08 14:34 ` Fabian Grünbichler
2024-12-16 9:12 ` [pve-devel] [PATCH v3 qemu-server 06/11] blockdev: block_resize: convert to blockdev Alexandre Derumier via pve-devel
2024-12-16 9:12 ` [pve-devel] [PATCH v3 qemu-server 07/11] blockdev: nbd_export: block-export-add : use drive-$id for nodename Alexandre Derumier via pve-devel
2024-12-16 9:12 ` [pve-devel] [PATCH v3 qemu-server 08/11] blockdev: convert drive_mirror to blockdev_mirror Alexandre Derumier via pve-devel
2025-01-08 15:19 ` Fabian Grünbichler
2025-01-13 8:27 ` DERUMIER, Alexandre via pve-devel
[not found] ` <0d0d4c4d73110cf0e692cae0ee65bf7f9a6ce93a.camel@groupe-cyllene.com>
2025-01-13 9:52 ` Fabian Grünbichler
2025-01-13 9:55 ` Fabian Grünbichler
2025-01-13 10:47 ` DERUMIER, Alexandre via pve-devel
2025-01-13 13:42 ` Fiona Ebner
2025-01-14 10:03 ` DERUMIER, Alexandre via pve-devel
[not found] ` <fa38efbd95b57ba57a5628d6acfcda9d5875fa82.camel@groupe-cyllene.com>
2025-01-15 9:39 ` Fiona Ebner
2025-01-15 9:51 ` Fabian Grünbichler
2025-01-15 10:06 ` Fiona Ebner
2025-01-15 10:15 ` Fabian Grünbichler
2025-01-15 10:46 ` Fiona Ebner
2025-01-15 10:50 ` Fabian Grünbichler
2025-01-15 11:01 ` Fiona Ebner
2025-01-15 13:01 ` DERUMIER, Alexandre via pve-devel
2025-01-16 14:56 ` DERUMIER, Alexandre via pve-devel
2025-01-15 10:15 ` DERUMIER, Alexandre via pve-devel
[not found] ` <c1559499319052d6cf10900efd5376c12389a60f.camel@groupe-cyllene.com>
2025-01-13 13:31 ` Fabian Grünbichler
2025-01-20 13:37 ` DERUMIER, Alexandre via pve-devel
2024-12-16 9:12 ` [pve-devel] [PATCH v3 qemu-server 09/11] blockdev: mirror: change aio on target if io_uring is not default Alexandre Derumier via pve-devel
2025-01-09 9:51 ` Fabian Grünbichler
2025-01-13 8:38 ` DERUMIER, Alexandre via pve-devel
2024-12-16 9:12 ` [pve-devel] [PATCH v3 qemu-server 10/11] blockdev: add backing_chain support Alexandre Derumier via pve-devel
2025-01-09 11:57 ` Fabian Grünbichler
2025-01-13 8:53 ` DERUMIER, Alexandre via pve-devel
2024-12-16 9:12 ` [pve-devel] [PATCH v3 qemu-server 11/11] qcow2: add external snapshot support Alexandre Derumier via pve-devel
2025-01-09 11:57 ` Fabian Grünbichler
2025-01-09 13:19 ` Fabio Fantoni via pve-devel
2025-01-20 13:44 ` DERUMIER, Alexandre via pve-devel
[not found] ` <3307ec388a763510ec78f97ed9f0de00c87d54b5.camel@groupe-cyllene.com>
2025-01-20 14:29 ` Fabio Fantoni via pve-devel
[not found] ` <6bdfe757-ae04-42e1-b197-c9ddb873e353@m2r.biz>
2025-01-20 14:41 ` DERUMIER, Alexandre via pve-devel
2025-01-13 10:08 ` DERUMIER, Alexandre via pve-devel [this message]
[not found] ` <0ae72889042e006d9202e837aac7ecf2b413e1b4.camel@groupe-cyllene.com>
2025-01-13 13:27 ` Fabian Grünbichler
2025-01-13 18:07 ` DERUMIER, Alexandre via pve-devel
2025-01-13 18:58 ` DERUMIER, Alexandre via pve-devel
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=mailman.242.1736762947.441.pve-devel@lists.proxmox.com \
--to=pve-devel@lists.proxmox.com \
--cc=alexandre.derumier@groupe-cyllene.com \
--cc=f.gruenbichler@proxmox.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.
Service provided by Proxmox Server Solutions GmbH | Privacy | Legal