public inbox for pve-devel@lists.proxmox.com
 help / color / mirror / Atom feed
From: "Fabian Grünbichler" <f.gruenbichler@proxmox.com>
To: "DERUMIER, Alexandre" <alexandre.derumier@groupe-cyllene.com>,
	"pve-devel@lists.proxmox.com" <pve-devel@lists.proxmox.com>
Subject: Re: [pve-devel] [PATCH v3 qemu-server 11/11] qcow2: add external snapshot support
Date: Mon, 13 Jan 2025 14:27:03 +0100 (CET)	[thread overview]
Message-ID: <601608293.2958.1736774823822@webmail.proxmox.com> (raw)
In-Reply-To: <0ae72889042e006d9202e837aac7ecf2b413e1b4.camel@groupe-cyllene.com>


> DERUMIER, Alexandre <alexandre.derumier@groupe-cyllene.com> hat am 13.01.2025 11:08 CET geschrieben:
> 
>  
> > Alexandre Derumier via pve-devel <pve-devel@lists.proxmox.com> hat am
> > 16.12.2024 10:12 CET geschrieben:
> 
> >>it would be great if there'd be a summary of the design choices and a
> >>high level summary of what happens to the files and block-node-graph
> >>here. it's a bit hard to judge from the code below whether it would
> >>be possible to eliminate the dynamically named block nodes, for
> >>example ;)
> 
> yes, sorry, I'll add more infos with qemu limitations and why I'm doing
> it like this.
> 
> >>a few more comments documenting the behaviour and ideally also some
> >>tests (mocking the QMP interactions?) would be nice
> yes, I'll add tests, need to find a good way to mock it.
> 
> 
> > +
> > +    #preallocate add a new current file
> > +    my $new_current_fmt_nodename = get_blockdev_nextid("fmt-
> > $deviceid", $nodes);
> > +    my $new_current_file_nodename = get_blockdev_nextid("file-
> > $deviceid", $nodes);
> 
> >>okay, so here we have a dynamic node name because the desired target
> >>name is still occupied. could we rename the old block node first?
> 
> we can't rename a node, that's the problem.
> 
> 
> > +    PVE::Storage::volume_snapshot($storecfg, $volid, $snap);
> 
> >>(continued from above) and this invocation here are the same??
> The invocation is the same, but they it's not doing the same if it's an
> external snasphot.
> 
> >> wouldn't this already create the snapshot on the storage layer? 
> yes, it's create the (lvm volume) +  qcow2 file with preallocation
> 
> >>and didn't we just hardlink + reopen + unlink to transform the
> >>previous current volume into the snap volume?
> yes.
> here, we are creating the new current volume,  adding it to the graph
> with blockdev-add, then finally switch to it with blockdev-snapshot
> 
> The ugly trick in pve-storage is in plugin.pm
> #rename current volume to snap volume
> rename($path, $snappath) if -e $path && !-e $snappath;
> or in lvm plugin.
> eval { lvrename($vg, $volname, $snap_volname) } ;
> 
> 
> (and you have already made comment about them ;)
> 
> 
> because I'm re-using volume_snapshot (I didn't have to add a new method
> in pve-storage) to allocate the snasphot file, but indeed, we have
> already to the rename online.
> 
> 
> >>should this maybe have been vdisk_alloc and it just works by
> accident?
> It's not works fine with vdisk_alloc, because the volume need to be
> created without the size specified but with backing file param instead.
> (if I remember, qemu-img is looking at the backing file size+metadas
> and set the correct total size for the new volume)

I am not sure I follow.. we create a snapshot, but then we pretend it isn't a file with backing file when passing it to qemu? this seems wrong.. IMHO we should just allocate (+format) here, and then let qemu do the backing link up instead of this confusing (and error-prone, as it masks problems that should be a hard error!) call..

> Maybe a better way could be to reuse vdisk_alloc, and add backing file
> as param ?

that seems.. wrong as well? the file can never be bigger just because it has a backing file right? why can't we just allocate and format the regular volume?

> > +    my $new_file_blockdev = generate_file_blockdev($storecfg,
> > $drive, $new_current_file_nodename);
> > +    my $new_fmt_blockdev = generate_format_blockdev($storecfg,
> > $drive, $new_current_fmt_nodename, $new_file_blockdev);
> > +
> > +    $new_fmt_blockdev->{backing} = undef;
> 
> >>generate_format_blockdev doesn't set backing? 
> yes, it's adding backing
> 
> >>maybe this should be >>converted into an assertion?
> 
> but they are a limitation of the qmp blockdev-ad ++blockdev-snapshot
> where the backing attribute need undef in the blockdev-add or the
> blockdev-snapshot will fail because it's trying itself to set the
> backing file when doing the switch.
> 
> From my test, it was related to this
> https://lists.gnu.org/archive/html/qemu-block/2019-10/msg01404.html

yeah, what I mean is:

generate_format_blockdev doesn't (and should never) set backing. so setting it to undef has no effect. but we might want to assert that it *is* undef, so that if we ever mistakenly change generate_format_blockdev to set backing, it will be caught instead of silently being papered over..

> > +    PVE::QemuServer::Monitor::mon_cmd($vmid, 'blockdev-add',
> > %$new_fmt_blockdev);
> > +    mon_cmd($vmid, 'blockdev-snapshot', node => $format_nodename,
> > overlay => $new_current_fmt_nodename);
> > +}
> > +
> > +sub blockdev_snap_rename {
> > +    my ($storecfg, $vmid, $deviceid, $drive, $src_path,
> > $target_path) = @_;
> 
> >>I think this whole thing needs more error handling and thought about
> >>how to recover from various points failing.. 
> yes, that's the problem with renaming, it's not atomic.
> 
> Also, if we need to recover (rollback), how to manage multiple disk ?

in general, a rollback with multiple disks that fails half-way through can only be recovered using another rollback, and that only works if the half-rollback is idempotent? another reason to avoid the need for renames wherever possible ;)

> >>there's also quite some overlap with blockdev_current_rename, I
> >>wonder whether it would be possible to simplify the code further by
> >merging the two? but see below, I think we can even get away with
> >>dropping this altogether if we switch from block-commit to block-
> >>stream..
> Yes, I have seperated them because I was not sure of the different
> workflow, and It was more simplier to fix one method without breaking
> the other.
> 
> I'll look to implement block-stream.  (and keep commit to initial image
> for the last snapshot delete)
> 
> 
> > +    #untaint
> > +    if ($src_path =~ m/^(\S+)$/) {
> > + $src_path = $1;
> > +    }
> > +    if ($target_path =~ m/^(\S+)$/) {
> > + $target_path = $1;
> > +    }
> 
> >>shouldn't that have happened in the storage plugin?
> >>
> > +
> > +    #create a hardlink
> > +    link($src_path, $target_path);
> 
> >>should this maybe be done by the storage plugin?
> 
> This was to avoid to introduce a sub method, but yes, it could be
> better indeed.
> 
> PVE::Storage::link  ?

the issue is that these are all storage-specific things done in qemu-server, and they should be done by the storage plugin, else a third-party plugin can never support external storages.. so we'd need to find the right level of abstraction to tell the storage plugin what to do when, and then the storage plugin can decide how it exposes both snapshot names with the same underlying snapshot (for a while)..


_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel

  parent reply	other threads:[~2025-01-13 13:27 UTC|newest]

Thread overview: 68+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <20241216091229.3142660-1-alexandre.derumier@groupe-cyllene.com>
2024-12-16  9:12 ` [pve-devel] [PATCH v1 pve-qemu 1/1] add block-commit-replaces option patch Alexandre Derumier via pve-devel
2025-01-08 13:27   ` Fabian Grünbichler
2025-01-10  7:55     ` DERUMIER, Alexandre via pve-devel
     [not found]     ` <34a164520eba035d1db5f70761b0f4aa59fecfa5.camel@groupe-cyllene.com>
2025-01-10  9:15       ` Fiona Ebner
2025-01-10  9:32         ` DERUMIER, Alexandre via pve-devel
     [not found]         ` <1e45e756801843dd46eb6ce2958d30885ad73bc2.camel@groupe-cyllene.com>
2025-01-13 14:28           ` Fiona Ebner
2025-01-14 10:10             ` DERUMIER, Alexandre via pve-devel
2024-12-16  9:12 ` [pve-devel] [PATCH v3 qemu-server 01/11] blockdev: cmdline: convert drive to blockdev syntax Alexandre Derumier via pve-devel
2025-01-08 14:17   ` Fabian Grünbichler
2025-01-10 13:50     ` DERUMIER, Alexandre via pve-devel
2024-12-16  9:12 ` [pve-devel] [PATCH v3 pve-storage 1/3] qcow2: add external snapshot support Alexandre Derumier via pve-devel
2025-01-09 12:36   ` Fabian Grünbichler
2025-01-10  9:10     ` DERUMIER, Alexandre via pve-devel
     [not found]     ` <f25028d41a9588e82889b3ef869a14f33cbd216e.camel@groupe-cyllene.com>
2025-01-10 11:02       ` Fabian Grünbichler
2025-01-10 11:51         ` DERUMIER, Alexandre via pve-devel
     [not found]         ` <1caecaa23e5d57030a9e31f2f0e67648f1930d6a.camel@groupe-cyllene.com>
2025-01-10 12:20           ` Fabian Grünbichler
2025-01-10 13:14             ` DERUMIER, Alexandre via pve-devel
2024-12-16  9:12 ` [pve-devel] [PATCH v3 qemu-server 02/11] blockdev: fix cfg2cmd tests Alexandre Derumier via pve-devel
2024-12-16  9:12 ` [pve-devel] [PATCH v3 pve-storage 2/3] lvmplugin: add qcow2 snapshot Alexandre Derumier via pve-devel
2025-01-09 13:55   ` Fabian Grünbichler
2025-01-10 10:16     ` DERUMIER, Alexandre via pve-devel
2024-12-16  9:12 ` [pve-devel] [PATCH v3 qemu-server 03/11] blockdev : convert qemu_driveadd && qemu_drivedel Alexandre Derumier via pve-devel
2025-01-08 14:26   ` Fabian Grünbichler
2025-01-10 14:08     ` DERUMIER, Alexandre via pve-devel
2024-12-16  9:12 ` [pve-devel] [PATCH v3 pve-storage 3/3] storage: vdisk_free: remove external snapshots Alexandre Derumier via pve-devel
2024-12-16  9:12 ` [pve-devel] [PATCH v3 qemu-server 04/11] blockdev: vm_devices_list : fix block-query Alexandre Derumier via pve-devel
2025-01-08 14:31   ` Fabian Grünbichler
2025-01-13  7:56     ` DERUMIER, Alexandre via pve-devel
2024-12-16  9:12 ` [pve-devel] [PATCH v3 qemu-server 05/11] blockdev: convert cdrom media eject/insert Alexandre Derumier via pve-devel
2025-01-08 14:34   ` Fabian Grünbichler
2024-12-16  9:12 ` [pve-devel] [PATCH v3 qemu-server 06/11] blockdev: block_resize: convert to blockdev Alexandre Derumier via pve-devel
2024-12-16  9:12 ` [pve-devel] [PATCH v3 qemu-server 07/11] blockdev: nbd_export: block-export-add : use drive-$id for nodename Alexandre Derumier via pve-devel
2024-12-16  9:12 ` [pve-devel] [PATCH v3 qemu-server 08/11] blockdev: convert drive_mirror to blockdev_mirror Alexandre Derumier via pve-devel
2025-01-08 15:19   ` Fabian Grünbichler
2025-01-13  8:27     ` DERUMIER, Alexandre via pve-devel
     [not found]     ` <0d0d4c4d73110cf0e692cae0ee65bf7f9a6ce93a.camel@groupe-cyllene.com>
2025-01-13  9:52       ` Fabian Grünbichler
2025-01-13  9:55         ` Fabian Grünbichler
2025-01-13 10:47         ` DERUMIER, Alexandre via pve-devel
2025-01-13 13:42           ` Fiona Ebner
2025-01-14 10:03             ` DERUMIER, Alexandre via pve-devel
     [not found]             ` <fa38efbd95b57ba57a5628d6acfcda9d5875fa82.camel@groupe-cyllene.com>
2025-01-15  9:39               ` Fiona Ebner
2025-01-15  9:51                 ` Fabian Grünbichler
2025-01-15 10:06                   ` Fiona Ebner
2025-01-15 10:15                     ` Fabian Grünbichler
2025-01-15 10:46                       ` Fiona Ebner
2025-01-15 10:50                         ` Fabian Grünbichler
2025-01-15 11:01                           ` Fiona Ebner
2025-01-15 13:01                       ` DERUMIER, Alexandre via pve-devel
2025-01-16 14:56                     ` DERUMIER, Alexandre via pve-devel
2025-01-15 10:15                   ` DERUMIER, Alexandre via pve-devel
     [not found]         ` <c1559499319052d6cf10900efd5376c12389a60f.camel@groupe-cyllene.com>
2025-01-13 13:31           ` Fabian Grünbichler
2025-01-20 13:37             ` DERUMIER, Alexandre via pve-devel
2024-12-16  9:12 ` [pve-devel] [PATCH v3 qemu-server 09/11] blockdev: mirror: change aio on target if io_uring is not default Alexandre Derumier via pve-devel
2025-01-09  9:51   ` Fabian Grünbichler
2025-01-13  8:38     ` DERUMIER, Alexandre via pve-devel
2024-12-16  9:12 ` [pve-devel] [PATCH v3 qemu-server 10/11] blockdev: add backing_chain support Alexandre Derumier via pve-devel
2025-01-09 11:57   ` Fabian Grünbichler
2025-01-13  8:53     ` DERUMIER, Alexandre via pve-devel
2024-12-16  9:12 ` [pve-devel] [PATCH v3 qemu-server 11/11] qcow2: add external snapshot support Alexandre Derumier via pve-devel
2025-01-09 11:57   ` Fabian Grünbichler
2025-01-09 13:19     ` Fabio Fantoni via pve-devel
2025-01-20 13:44       ` DERUMIER, Alexandre via pve-devel
     [not found]       ` <3307ec388a763510ec78f97ed9f0de00c87d54b5.camel@groupe-cyllene.com>
2025-01-20 14:29         ` Fabio Fantoni via pve-devel
     [not found]         ` <6bdfe757-ae04-42e1-b197-c9ddb873e353@m2r.biz>
2025-01-20 14:41           ` DERUMIER, Alexandre via pve-devel
2025-01-13 10:08     ` DERUMIER, Alexandre via pve-devel
     [not found]     ` <0ae72889042e006d9202e837aac7ecf2b413e1b4.camel@groupe-cyllene.com>
2025-01-13 13:27       ` Fabian Grünbichler [this message]
2025-01-13 18:07         ` DERUMIER, Alexandre via pve-devel
2025-01-13 18:58           ` DERUMIER, Alexandre via pve-devel

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=601608293.2958.1736774823822@webmail.proxmox.com \
    --to=f.gruenbichler@proxmox.com \
    --cc=alexandre.derumier@groupe-cyllene.com \
    --cc=pve-devel@lists.proxmox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox
Service provided by Proxmox Server Solutions GmbH | Privacy | Legal