all lists on lists.proxmox.com
 help / color / mirror / Atom feed
From: "Fabian Grünbichler" <f.gruenbichler@proxmox.com>
To: "DERUMIER, Alexandre" <alexandre.derumier@groupe-cyllene.com>,
	"pve-devel@lists.proxmox.com" <pve-devel@lists.proxmox.com>
Subject: Re: [pve-devel] [PATCH v4 pve-storage 1/5] qcow2: add external snapshot support
Date: Wed, 2 Apr 2025 10:28:35 +0200 (CEST)	[thread overview]
Message-ID: <1188660913.4446.1743582515306@webmail.proxmox.com> (raw)
In-Reply-To: <0e2cd118f35aa8d4c410d362fea1a1b366df1570.camel@groupe-cyllene.com>


> DERUMIER, Alexandre <alexandre.derumier@groupe-cyllene.com> hat am 02.04.2025 10:01 CEST geschrieben:
> 
>  
> >  
> > @@ -716,7 +721,11 @@ sub filesystem_path {
> >  
> >      my $dir = $class->get_subdir($scfg, $vtype);
> >  
> > -    $dir .= "/$vmid" if $vtype eq 'images';
> > +    if ($scfg->{snapext} && $snapname) {
> > + $name = $class->get_snap_volname($volname, $snapname);
> > +    } else {
> > + $dir .= "/$vmid" if $vtype eq 'images';
> > +    }
> 
> >>this is a bit weird, as it mixes volnames (with the `$vmid/` prefix)
> >>and names (without), it's only called twice in this patch, and this
> >>here already has $volname parsed, so could we maybe let
> >>get_snap_volname take and return the $name part without the dir?
> 
> ok!
> 
> >  
> >      my $path = "$dir/$name";
> >  
> > @@ -873,7 +882,7 @@ sub clone_image {
> >  }
> >  
> >  sub alloc_image {
> > -    my ($class, $storeid, $scfg, $vmid, $fmt, $name, $size) = @_;
> > +    my ($class, $storeid, $scfg, $vmid, $fmt, $name, $size,
> > $backing) = @_;
> 
> >>this extends the storage API, so it should actually do that.. and
> >>probably $backing should not be an arbitrary path, but something that
> >>is resolved locally?
> 
> I'll send the $snapname as param instead

see my comments on the qemu-server side, I think it would be even better if we could just get rid of extending alloc_image like this, and instead always go via volume_snapshot..

> 
> 
> > +
> > +    if($backing) {
> > + push @$cmd, '-b', $backing, '-F', 'qcow2';
> > + push @$options, 'extended_l2=on','cluster_size=128k';
> > +    };
> > +    push @$options, preallocation_cmd_option($scfg, $fmt);
> > +    push @$cmd, '-o', join(',', @$options) if @$options > 0;
> > +    push @$cmd, '-f', $fmt, $path;
> > +    push @$cmd, "${size}K" if !$backing;
> 
> >>is this because it will automatically take the size from the backing
> >>image?
> 
> Yes, it take size from the backing.  (and refuse if you specify size
> param at the same time than backing file)

we pass a size and a backing file in qemu-server, so I guess that is wrong there? ;)

> 
> 
> > + my $path = $class->path($scfg, $volname, $storeid);
> > + my $snappath = $class->path($scfg, $volname, $storeid, $snap);
> > + #rename current volume to snap volume
> > + die "snapshot volume $snappath already exist\n" if -e $snappath;
> > + rename($path, $snappath) if -e $path;
> 
> >>this is still looking weird.. I don't think it makes sense interface
> >>wise to allow snapshotting a volume that doesn't even exist..
> 
> This is more by security, I'm still unsure of the behaviour if you have
> multiple disks, and that snapshot is dying in the middle. (1 disk
> rename, the other not renamed). 

I am not sure what we gain by this other than papering over issues.

for multi-disks what we'd need to do is:
- loop over volumes
-- take a snapshot of volume
-- as part of error handling directly in taking a snapshot, undo all steps done until the point of error
- if an error occurs, loop over all already correctly snapshotted volumes
-- undo snapshot of each such volume

> 
> > +
> > + my ($vtype, $name, $vmid, undef, undef, $isBase, $format) =
> > +     $class->parse_volname($volname);
> > +
> > + $class->alloc_image($storeid, $scfg, $vmid, 'qcow2', $name, undef,
> > $snappath);
> > + if ($@) {
> > +     eval { $class->free_image($storeid, $scfg, $volname, 0) };
> > +     warn $@ if $@;
> 
> >>missing cleanup - this should undo the rename from above
> 
> Do you have an idea how to do it with mutiple disk ?  
> (revert renaming of other disks elsewhere in the code? just keep them
> like this)? 

see above, the volume_snapshot task should clean up what it did up to the point of error, and its caller is responsible for undoing already completed volume_snapshot calls. obviously this won't always work (for example, if the snapshot fails because the storage has a problem, it's very likely that undoing things is not possible either and manual cleanup will be required)

> 
> 
> >  
> > @@ -1187,9 +1251,15 @@ sub volume_snapshot_rollback {
> >  
> >      my $path = $class->filesystem_path($scfg, $volname);
> >  
> > -    my $cmd = ['/usr/bin/qemu-img', 'snapshot','-a', $snap, $path];
> > -
> > -    run_command($cmd);
> > +    if ($scfg->{snapext}) {
> > + #simply delete the current snapshot and recreate it
> > + my $path = $class->filesystem_path($scfg, $volname);
> > + unlink($path);
> > + $class->volume_snapshot($scfg, $storeid, $volname, $snap);
> 
> >>instead of volume_snapshot, this could simply call alloc_image with
> >>the backing file? then volume_snapshot could always rename and always
> >>cleanup properly..
> 
> Yes , better like this indeed

I think alloc_image should be split into an internal helper that takes the backing file parameter, and the existing alloc_image that shouldn't get that parameter though

> 
> > 
> > + } else {
> > +     #we rebase the child image on the parent as new backing image
> 
> >>should we extend this to make it clear what this means? it means
> >>copying any parts of $snap that are not in $parent and not yet
> >>overwritten by $child into $child, right?
> >>
> yes,
> intermediate snapshot: (rebase)
> -------------------------------
> snap1 (10G)-->snap2 (1G)----current(1G)
> ---> delete snap2
> 
> rebase current on snap1
> 
> snap1(10G)----->current(2G)
> 
> 
> or
> 
> snap1 (10G)-->snap2 (1G)----> snap3 (1G)--->current(1G)
> ---> delete snap2
> 
> rebase snap3 on snap1
> 
> snap1 (10G)---> snap3 (2G)--->current(1G)
> 
> 
> 
> >>so how expensive this is depends on:
> >>- how many changes are between $parent and $snap (increases cost)
> >>- how many of those are overwritten by changes between $snap and
> >>$child (decreases cost)
> 
> 
> but yes, the final size of the child is not 100% the additional content
> of the deleted snapshot, if some blocks has already been overwriten in
> the child
> 
> 
> so, we could call it: "merge diff content of the delete snap to the
> childsnap"
> 
> 
> 
> 
> 
> >>+sub get_snap_volname {
> >>+    my ($class, $volname, $snapname) = @_;
> >>+
> >>+    my ($vtype, $name, $vmid, $basename, $basevmid, $isBase,
> >>$format) = $class->parse_volname($volname);
> +    $name = !$snapname || $snapname eq 'current' ? $volname :
> "$vmid/snap-$snapname-$name";
> 
> >>other way round would be better to group by volume first IMHO
> ($vmid/snap-$name-$snapname), as this is similar to how we encode
> >>snapshots often on the storage level (volume@snap). we also need to
> >>have some delimiter between snapshot and volume name that is not
> >>allowed in either (hard for volname since basically everything but
> >>'/' goes, but snapshots have a restricted character set (configid,
> >>which means alphanumeric, hyphen and underscore), so we could use
> >>something like '.' as delimiter? or we switch to directories and do
> >>$vmid/snap/$snap/$name?)
> 
> Personnaly, I prefer a '.' delimiter than sub directory. (better to
> have the info in the filename itself)

we can still bikeshed this later once other issue are ironed out, I just wanted to note it - it's very unfortunate that volume names currently are so unrestricted, but we have to live with it for now. I just realized that even snap-XXX.qcow2 would be recognized/returned as image, so we need something else or filter them out anyway.


_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel

  parent reply	other threads:[~2025-04-02  8:29 UTC|newest]

Thread overview: 35+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <20250311102905.2680524-1-alexandre.derumier@groupe-cyllene.com>
2025-03-11 10:28 ` [pve-devel] [PATCH v4 pve-qemu 1/1] add block-commit-replaces option patch Alexandre Derumier via pve-devel
2025-03-11 10:28 ` [pve-devel] [PATCH v4 qemu-server 01/11] blockdev: cmdline: convert drive to blockdev syntax Alexandre Derumier via pve-devel
2025-03-11 10:28 ` [pve-devel] [PATCH v4 pve-storage 1/5] qcow2: add external snapshot support Alexandre Derumier via pve-devel
2025-04-01 13:50   ` Fabian Grünbichler
2025-04-02  8:01     ` DERUMIER, Alexandre via pve-devel
     [not found]     ` <0e2cd118f35aa8d4c410d362fea1a1b366df1570.camel@groupe-cyllene.com>
2025-04-02  8:28       ` Fabian Grünbichler [this message]
2025-04-03  4:27         ` DERUMIER, Alexandre via pve-devel
2025-03-11 10:28 ` [pve-devel] [PATCH v4 qemu-server 02/11] blockdev : convert qemu_driveadd && qemu_drivedel Alexandre Derumier via pve-devel
2025-03-11 10:28 ` [pve-devel] [PATCH v4 pve-storage 2/5] lvmplugin: add qcow2 snapshot Alexandre Derumier via pve-devel
2025-04-01 13:50   ` Fabian Grünbichler
2025-03-11 10:28 ` [pve-devel] [PATCH v4 qemu-server 03/11] replace qemu_block_set_io_throttle with qom-set throttlegroup limits Alexandre Derumier via pve-devel
2025-03-11 10:28 ` [pve-devel] [PATCH v4 pve-storage 3/5] storage: vdisk_free: remove external snapshots Alexandre Derumier via pve-devel
2025-04-01 13:50   ` Fabian Grünbichler
2025-04-07 11:02     ` DERUMIER, Alexandre via pve-devel
2025-04-07 11:29     ` DERUMIER, Alexandre via pve-devel
2025-03-11 10:28 ` [pve-devel] [PATCH v4 qemu-server 04/11] blockdev: vm_devices_list : fix block-query Alexandre Derumier via pve-devel
2025-04-02  8:10   ` Fabian Grünbichler
2025-04-11 17:32     ` DERUMIER, Alexandre via pve-devel
2025-03-11 10:28 ` [pve-devel] [PATCH v4 pve-storage 4/5] lvm: lvrename helper: allow path Alexandre Derumier via pve-devel
2025-04-01 13:50   ` Fabian Grünbichler
2025-03-11 10:28 ` [pve-devel] [PATCH v4 qemu-server 05/11] blockdev: convert cdrom media eject/insert Alexandre Derumier via pve-devel
2025-03-11 10:28 ` [pve-devel] [PATCH v4 pve-storage 5/5] lvm: add lvremove helper Alexandre Derumier via pve-devel
2025-04-01 13:50   ` Fabian Grünbichler
2025-03-11 10:29 ` [pve-devel] [PATCH v4 qemu-server 06/11] blockdev: block_resize: convert to blockdev Alexandre Derumier via pve-devel
2025-03-11 10:29 ` [pve-devel] [PATCH v4 qemu-server 07/11] blockdev: nbd_export: block-export-add : use drive-$id for nodename Alexandre Derumier via pve-devel
2025-03-11 10:29 ` [pve-devel] [PATCH v4 qemu-server 08/11] blockdev: convert drive_mirror to blockdev_mirror Alexandre Derumier via pve-devel
2025-03-11 10:29 ` [pve-devel] [PATCH v4 qemu-server 09/11] blockdev: change aio on target if io_uring is not default Alexandre Derumier via pve-devel
2025-03-11 10:29 ` [pve-devel] [PATCH v4 qemu-server 10/11] blockdev: add backing_chain support Alexandre Derumier via pve-devel
2025-04-02  8:10   ` Fabian Grünbichler
2025-03-11 10:29 ` [pve-devel] [PATCH v4 qemu-server 11/11] qcow2: add external snapshot support Alexandre Derumier via pve-devel
2025-04-02  8:10   ` Fabian Grünbichler
2025-04-03  4:51     ` DERUMIER, Alexandre via pve-devel
2025-04-04 11:31     ` DERUMIER, Alexandre via pve-devel
     [not found]     ` <3e516016a970e52e5a1014dbcd6cf9507581da74.camel@groupe-cyllene.com>
2025-04-04 11:37       ` Fabian Grünbichler
2025-04-04 13:02         ` DERUMIER, Alexandre via pve-devel

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1188660913.4446.1743582515306@webmail.proxmox.com \
    --to=f.gruenbichler@proxmox.com \
    --cc=alexandre.derumier@groupe-cyllene.com \
    --cc=pve-devel@lists.proxmox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.
Service provided by Proxmox Server Solutions GmbH | Privacy | Legal