public inbox for pve-devel@lists.proxmox.com
 help / color / mirror / Atom feed
From: "Fabian Grünbichler" <f.gruenbichler@proxmox.com>
To: "DERUMIER, Alexandre" <alexandre.derumier@groupe-cyllene.com>,
	"pve-devel@lists.proxmox.com" <pve-devel@lists.proxmox.com>
Subject: Re: [pve-devel] [PATCH v3 pve-storage 1/3] qcow2: add external snapshot support
Date: Fri, 10 Jan 2025 12:02:04 +0100 (CET)	[thread overview]
Message-ID: <599238234.1889.1736506924431@webmail.proxmox.com> (raw)
In-Reply-To: <f25028d41a9588e82889b3ef869a14f33cbd216e.camel@groupe-cyllene.com>


> DERUMIER, Alexandre <alexandre.derumier@groupe-cyllene.com> hat am 10.01.2025 10:10 CET geschrieben:
> > +    if ($scfg->{snapext}) {
> > + #technically, we could manage multibranch, we it need lot more work
> > for snapshot delete
> > + #we need to implemente block-stream from deleted snapshot to all
> > others child branchs
> 
> >>see my comments in qemu-server - I think we actually want block-
> >>stream anyway, since it has the semantics we want..
> 
> I don't agree, we don't want always, because with block-stream, you
> need to copy parent to child.
> 
> for example, you have a 1TB image,  you take a snapshot, writing 5MB in
> the snapshot, delete the snapshot,  you'll need to read/copy 1TB data
> from parent to the snapshot file.  
> I don't read your qemu-server comment yet ;)

yes, for the "first" snapshot that is true (since that one is basically the baseline data, which will often be huge compared to the snapshot delta). but streaming (rebasing) saves us the rename, which makes the error handling a lot easier/less risky. maybe we could special case the first snapshot as a performance optimization? ;)

> > @@ -1201,13 +1257,52 @@ sub volume_snapshot_delete {
> >  
> >      return 1 if $running;
> >  
> > +    my $cmd = "";
> >      my $path = $class->filesystem_path($scfg, $volname);
> >  
> > -    $class->deactivate_volume($storeid, $scfg, $volname, $snap, {});
> > +    if ($scfg->{snapext}) {
> >  
> > -    my $cmd = ['/usr/bin/qemu-img', 'snapshot','-d', $snap, $path];
> > + my $snapshots = $class->volume_snapshot_info($scfg, $storeid,
> > $volname);
> > + my $snappath = $snapshots->{$snap}->{file};
> > + return if !-e $snappath;  #already deleted ?
> 
> >>shouldn't this be an error?
> 
> This one was if we want to do retry in case of error, if we have
> multiple disks. (for example, first snapshot delete api call,  the
> first disk remove the snapshot, but a bug occur and second disk don't
> remove the snapshot). 
> 
> User could want to unlock the vm-snaphot lock and  and fix it manually
> with calling again the snapshot delete.
> 
> I'm not sure how to handle this correctly ?

I think the force parameter for snapshot deletion covers this already, and it should be fine for this to die..

> 
> > +     print"commit $childpath\n";
> > +     $cmd = ['/usr/bin/qemu-img', 'commit', $childpath];
> > +     run_command($cmd);
> > +     print"delete $childpath\n";
> > +
> > +     unlink($childpath);
> 
> this unlink can be skipped?
> 
> > +     print"rename $snappath to $childpath\n";
> > +     rename($snappath, $childpath);
> 
> >>since this will overwrite $childpath anyway.. this also reduces the
> >>chance of something going wrong:
> >>
> >>- if the commit fails halfway through, nothing bad should have
> >>happened, other than some data is now stored in two snapshots and
> >>takes up extra space
> >>- if the rename fails, then all of the data of $snap is stored twice,
> >>but the backing chain is still valid
> >>
> >>notable, there is no longer a gap where $childpath doesn't exist,
> >>which would break the backing chain!
> 
> yes you are right, better to have it atomic indeed
> 
> 
> > + } else {
> > +     print"commit $snappath\n";
> > +     $cmd = ['/usr/bin/qemu-img', 'commit', $snappath];
> 
> >>leftover from previous version? not used/overwritten below ;)
> 
> no, this is really to commit the the snapshot to parent

but it is not executed..

> 
> > +     #if we delete an intermediate snapshot, we need to link upper
> > snapshot to base snapshot
> > +     die "missing parentsnap snapshot to rebase child $childpath\n"
> > if !$parentpath;
> > +     print "link $childsnap to $parentsnap\n";
> > +     $cmd = ['/usr/bin/qemu-img', 'rebase', '-u', '-b', $parentpath,
> > '-F', 'qcow2', '-f', 'qcow2', $childpath];
> 
> >>does this work? I would read the qemu-img manpage to say that '-u' is
> >>for when you've moved/converted the backing file, and want to update
> >>the reference in its overlay, and that it doesn't copy any data.. but
> >>we need to copy the data from $snap to $childpath (we just want to
> >>delete the snapshot, we don't want to drop all its changes from the
> >>history, that would corrupt the contents of the image).
> >>note the description of the "safe" variant:
> >>
> >>"                     This  is  the  default mode and performs a real
> >>rebase operation. The new backing file may differ from the old one
> >>and qemu-img rebase will take care of keeping the
> >>                     guest-visible content of FILENAME unchanged."
> >>
> >>IMHO this is the behaviour we need here?
> 
> This is only to change the backing chain ref in the qcow2 snapshot.
> (this is the only way to do it, they was a qemu-img ammend command in
> past, but it has been removed in
> 2020 https://patchwork.kernel.org/project/qemu-devel/patch/20200403175859.863248-5-eblake@redhat.com/,
> so the rebase is the good way to do it)
> 
> The merge is done by the previous qemu-img commit. (qemu-img commit
> can't change  change automatically the backing chain of the upper
> snapshot, because it don't have any idea than an upper snapshot could
> exist).

see above and below ;)

> this is for this usecase :
> 
> A<----B<----C.
> 
> you commit B to A,  then you need to change the backing file of C to A
> (instead B)
> 
> A<----C

but this is the wrong semantics.. the writes/delta in B need to go to C (they happened after A), not to A!

> (when done it live, qemu qmp block-commit is able to change
> automatically the backing chain of the upper snapshot, because qemu
> known the whole chain)

I think it's wrong there as well, see my comments on those patches ;)

> This is how libvirt is doing too
> https://kashyapc.fedorapeople.org/virt/lc-2012/snapshots-handout.html
> see "Deleting snapshots (and 'offline commit')"
> Method (1): base <- sn1 <- sn3 (by copying sn2 into sn1)
> Method (2): base <- sn1 <- sn3 (by copying sn2 into sn3)
> (This is commit vs stream)

but they use the "wrong" (v1) naming scheme where the name of the snapshot and the content don't line up..

> I think that we should look at used space of parent vs child,
> to choose the correct direction/method.


_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel

  parent reply	other threads:[~2025-01-10 11:02 UTC|newest]

Thread overview: 38+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <20241216091229.3142660-1-alexandre.derumier@groupe-cyllene.com>
2024-12-16  9:12 ` [pve-devel] [PATCH v1 pve-qemu 1/1] add block-commit-replaces option patch Alexandre Derumier via pve-devel
2025-01-08 13:27   ` Fabian Grünbichler
2025-01-10  7:55     ` DERUMIER, Alexandre via pve-devel
     [not found]     ` <34a164520eba035d1db5f70761b0f4aa59fecfa5.camel@groupe-cyllene.com>
2025-01-10  9:15       ` Fiona Ebner
2025-01-10  9:32         ` DERUMIER, Alexandre via pve-devel
2024-12-16  9:12 ` [pve-devel] [PATCH v3 qemu-server 01/11] blockdev: cmdline: convert drive to blockdev syntax Alexandre Derumier via pve-devel
2025-01-08 14:17   ` Fabian Grünbichler
2025-01-10 13:50     ` DERUMIER, Alexandre via pve-devel
2024-12-16  9:12 ` [pve-devel] [PATCH v3 pve-storage 1/3] qcow2: add external snapshot support Alexandre Derumier via pve-devel
2025-01-09 12:36   ` Fabian Grünbichler
2025-01-10  9:10     ` DERUMIER, Alexandre via pve-devel
     [not found]     ` <f25028d41a9588e82889b3ef869a14f33cbd216e.camel@groupe-cyllene.com>
2025-01-10 11:02       ` Fabian Grünbichler [this message]
2025-01-10 11:51         ` DERUMIER, Alexandre via pve-devel
     [not found]         ` <1caecaa23e5d57030a9e31f2f0e67648f1930d6a.camel@groupe-cyllene.com>
2025-01-10 12:20           ` Fabian Grünbichler
2025-01-10 13:14             ` DERUMIER, Alexandre via pve-devel
2024-12-16  9:12 ` [pve-devel] [PATCH v3 qemu-server 02/11] blockdev: fix cfg2cmd tests Alexandre Derumier via pve-devel
2024-12-16  9:12 ` [pve-devel] [PATCH v3 pve-storage 2/3] lvmplugin: add qcow2 snapshot Alexandre Derumier via pve-devel
2025-01-09 13:55   ` Fabian Grünbichler
2025-01-10 10:16     ` DERUMIER, Alexandre via pve-devel
2024-12-16  9:12 ` [pve-devel] [PATCH v3 qemu-server 03/11] blockdev : convert qemu_driveadd && qemu_drivedel Alexandre Derumier via pve-devel
2025-01-08 14:26   ` Fabian Grünbichler
2025-01-10 14:08     ` DERUMIER, Alexandre via pve-devel
2024-12-16  9:12 ` [pve-devel] [PATCH v3 pve-storage 3/3] storage: vdisk_free: remove external snapshots Alexandre Derumier via pve-devel
2024-12-16  9:12 ` [pve-devel] [PATCH v3 qemu-server 04/11] blockdev: vm_devices_list : fix block-query Alexandre Derumier via pve-devel
2025-01-08 14:31   ` Fabian Grünbichler
2024-12-16  9:12 ` [pve-devel] [PATCH v3 qemu-server 05/11] blockdev: convert cdrom media eject/insert Alexandre Derumier via pve-devel
2025-01-08 14:34   ` Fabian Grünbichler
2024-12-16  9:12 ` [pve-devel] [PATCH v3 qemu-server 06/11] blockdev: block_resize: convert to blockdev Alexandre Derumier via pve-devel
2024-12-16  9:12 ` [pve-devel] [PATCH v3 qemu-server 07/11] blockdev: nbd_export: block-export-add : use drive-$id for nodename Alexandre Derumier via pve-devel
2024-12-16  9:12 ` [pve-devel] [PATCH v3 qemu-server 08/11] blockdev: convert drive_mirror to blockdev_mirror Alexandre Derumier via pve-devel
2025-01-08 15:19   ` Fabian Grünbichler
2024-12-16  9:12 ` [pve-devel] [PATCH v3 qemu-server 09/11] blockdev: mirror: change aio on target if io_uring is not default Alexandre Derumier via pve-devel
2025-01-09  9:51   ` Fabian Grünbichler
2024-12-16  9:12 ` [pve-devel] [PATCH v3 qemu-server 10/11] blockdev: add backing_chain support Alexandre Derumier via pve-devel
2025-01-09 11:57   ` Fabian Grünbichler
2024-12-16  9:12 ` [pve-devel] [PATCH v3 qemu-server 11/11] qcow2: add external snapshot support Alexandre Derumier via pve-devel
2025-01-09 11:57   ` Fabian Grünbichler
2025-01-09 13:19     ` Fabio Fantoni via pve-devel

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=599238234.1889.1736506924431@webmail.proxmox.com \
    --to=f.gruenbichler@proxmox.com \
    --cc=alexandre.derumier@groupe-cyllene.com \
    --cc=pve-devel@lists.proxmox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox
Service provided by Proxmox Server Solutions GmbH | Privacy | Legal