From: "Fabian Grünbichler" <f.gruenbichler@proxmox.com>
To: "DERUMIER, Alexandre" <alexandre.derumier@groupe-cyllene.com>,
"pve-devel@lists.proxmox.com" <pve-devel@lists.proxmox.com>
Subject: Re: [pve-devel] [PATCH v4 pve-storage 1/5] qcow2: add external snapshot support
Date: Wed, 2 Apr 2025 10:28:35 +0200 (CEST)
Message-ID: <1188660913.4446.1743582515306@webmail.proxmox.com>
In-Reply-To: <0e2cd118f35aa8d4c410d362fea1a1b366df1570.camel@groupe-cyllene.com>
> DERUMIER, Alexandre <alexandre.derumier@groupe-cyllene.com> hat am 02.04.2025 10:01 CEST geschrieben:
>
>
> >
> > @@ -716,7 +721,11 @@ sub filesystem_path {
> >
> > my $dir = $class->get_subdir($scfg, $vtype);
> >
> > - $dir .= "/$vmid" if $vtype eq 'images';
> > + if ($scfg->{snapext} && $snapname) {
> > + $name = $class->get_snap_volname($volname, $snapname);
> > + } else {
> > + $dir .= "/$vmid" if $vtype eq 'images';
> > + }
>
> >>this is a bit weird, as it mixes volnames (with the `$vmid/` prefix)
> >>and names (without), it's only called twice in this patch, and this
> >>here already has $volname parsed, so could we maybe let
> >>get_snap_volname take and return the $name part without the dir?
>
> ok!
>
> >
> > my $path = "$dir/$name";
> >
> > @@ -873,7 +882,7 @@ sub clone_image {
> > }
> >
> > sub alloc_image {
> > - my ($class, $storeid, $scfg, $vmid, $fmt, $name, $size) = @_;
> > + my ($class, $storeid, $scfg, $vmid, $fmt, $name, $size,
> > $backing) = @_;
>
> >>this extends the storage API, so it should actually do that.. and
> >>probably $backing should not be an arbitrary path, but something that
> >>is resolved locally?
>
> I'll send the $snapname as param instead
see my comments on the qemu-server side, I think it would be even better if we could just get rid of extending alloc_image like this, and instead always go via volume_snapshot..
>
>
> > +
> > + if($backing) {
> > + push @$cmd, '-b', $backing, '-F', 'qcow2';
> > + push @$options, 'extended_l2=on','cluster_size=128k';
> > + };
> > + push @$options, preallocation_cmd_option($scfg, $fmt);
> > + push @$cmd, '-o', join(',', @$options) if @$options > 0;
> > + push @$cmd, '-f', $fmt, $path;
> > + push @$cmd, "${size}K" if !$backing;
>
> >>is this because it will automatically take the size from the backing
> >>image?
>
> Yes, it takes the size from the backing file (and refuses if you specify
> the size param at the same time as a backing file).
we pass a size and a backing file in qemu-server, so I guess that is wrong there? ;)
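for reference, the command construction in the hunk above amounts to roughly the following (a Python sketch mirroring the logic; the function name and arguments are illustrative, not the actual plugin code):

```python
def build_create_cmd(path, size_kb, fmt="qcow2", backing=None, prealloc=None):
    """Sketch of the qemu-img invocation built in alloc_image above."""
    cmd = ["/usr/bin/qemu-img", "create"]
    options = []
    if backing:
        # overlay image: the virtual size is inherited from the backing file
        cmd += ["-b", backing, "-F", "qcow2"]
        options += ["extended_l2=on", "cluster_size=128k"]
    if prealloc:
        options.append(f"preallocation={prealloc}")
    if options:
        cmd += ["-o", ",".join(options)]
    cmd += ["-f", fmt, path]
    if not backing:
        # only pass an explicit size when there is no backing file
        cmd.append(f"{size_kb}K")
    return cmd
```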
>
>
> > + my $path = $class->path($scfg, $volname, $storeid);
> > + my $snappath = $class->path($scfg, $volname, $storeid, $snap);
> > + #rename current volume to snap volume
> > + die "snapshot volume $snappath already exist\n" if -e $snappath;
> > + rename($path, $snappath) if -e $path;
>
> >>this is still looking weird.. I don't think it makes sense interface
> >>wise to allow snapshotting a volume that doesn't even exist..
>
> This is more for safety; I'm still unsure of the behaviour if you have
> multiple disks and the snapshot dies in the middle (one disk
> renamed, the other not).
I am not sure what we gain by this other than papering over issues.
for multiple disks, what we'd need to do is:
- loop over volumes
-- take a snapshot of volume
-- as part of error handling directly in taking a snapshot, undo all steps done until the point of error
- if an error occurs, loop over all already correctly snapshotted volumes
-- undo snapshot of each such volume
>
> > +
> > + my ($vtype, $name, $vmid, undef, undef, $isBase, $format) =
> > + $class->parse_volname($volname);
> > +
> > + $class->alloc_image($storeid, $scfg, $vmid, 'qcow2', $name, undef,
> > $snappath);
> > + if ($@) {
> > + eval { $class->free_image($storeid, $scfg, $volname, 0) };
> > + warn $@ if $@;
>
> >>missing cleanup - this should undo the rename from above
>
> Do you have an idea how to do it with multiple disks?
> (revert the renaming of the other disks elsewhere in the code? or just
> keep them like this?)
see above, the volume_snapshot task should clean up what it did up to the point of error, and its caller is responsible for undoing already completed volume_snapshot calls. obviously this won't always work (for example, if the snapshot fails because the storage has a problem, it's very likely that undoing things is not possible either and manual cleanup will be required)
>
>
> >
> > @@ -1187,9 +1251,15 @@ sub volume_snapshot_rollback {
> >
> > my $path = $class->filesystem_path($scfg, $volname);
> >
> > - my $cmd = ['/usr/bin/qemu-img', 'snapshot','-a', $snap, $path];
> > -
> > - run_command($cmd);
> > + if ($scfg->{snapext}) {
> > + #simply delete the current snapshot and recreate it
> > + my $path = $class->filesystem_path($scfg, $volname);
> > + unlink($path);
> > + $class->volume_snapshot($scfg, $storeid, $volname, $snap);
>
> >>instead of volume_snapshot, this could simply call alloc_image with
> >>the backing file? then volume_snapshot could always rename and always
> >>cleanup properly..
>
> Yes , better like this indeed
I think alloc_image should be split into an internal helper that takes the backing file parameter, and the existing alloc_image that shouldn't get that parameter though
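concretely, that split could look roughly like this (a Python sketch of the structure only; class and method names are illustrative, not the actual plugin code):

```python
class DirPlugin:
    """Sketch: keep the public storage API stable and confine the
    backing-file parameter to an internal helper."""

    def alloc_image(self, vmid, fmt, name, size):
        # public storage-API entry point: signature unchanged, no backing file
        return self._alloc_image(vmid, fmt, name, size, backing=None)

    def _alloc_image(self, vmid, fmt, name, size, backing):
        # internal helper: volume_snapshot / rollback call this directly
        cmd = ["qemu-img", "create", "-f", fmt]
        if backing:
            cmd += ["-b", backing, "-F", "qcow2"]
        cmd.append(f"{vmid}/{name}")
        if not backing:
            cmd.append(f"{size}K")
        return cmd
```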
>
> >
> > + } else {
> > + #we rebase the child image on the parent as new backing image
>
> >>should we extend this to make it clear what this means? it means
> >>copying any parts of $snap that are not in $parent and not yet
> >>overwritten by $child into $child, right?
> >>
> yes,
> intermediate snapshot: (rebase)
> -------------------------------
> snap1 (10G)-->snap2 (1G)----current(1G)
> ---> delete snap2
>
> rebase current on snap1
>
> snap1(10G)----->current(2G)
>
>
> or
>
> snap1 (10G)-->snap2 (1G)----> snap3 (1G)--->current(1G)
> ---> delete snap2
>
> rebase snap3 on snap1
>
> snap1 (10G)---> snap3 (2G)--->current(1G)
>
>
>
> >>so how expensive this is depends on:
> >>- how many changes are between $parent and $snap (increases cost)
> >>- how many of those are overwritten by changes between $snap and
> >>$child (decreases cost)
>
>
> but yes, the final size of the child is not 100% the additional content
> of the deleted snapshot, if some blocks have already been overwritten in
> the child
>
>
> so, we could call it: "merge the diff content of the deleted snap into
> the child snap"
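treating each layer as the set of clusters it holds, the copy cost can be stated precisely (illustrative Python, not plugin code):

```python
# clusters written in each layer of the chain parent -> snap -> child
snap_clusters  = {1, 2, 3, 4}   # changed between parent and snap
child_clusters = {3, 4, 5}      # changed between snap and child

# rebasing the child onto the parent (deleting snap) must copy every
# cluster that snap holds and the child has not yet overwritten
to_copy = snap_clusters - child_clusters
print(sorted(to_copy))          # [1, 2]
```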
>
>
>
>
>
> > +sub get_snap_volname {
> > +    my ($class, $volname, $snapname) = @_;
> > +
> > +    my ($vtype, $name, $vmid, $basename, $basevmid, $isBase, $format) =
> > +        $class->parse_volname($volname);
> > +    $name = !$snapname || $snapname eq 'current' ? $volname :
> > +        "$vmid/snap-$snapname-$name";
>
> >>other way round would be better to group by volume first IMHO
> >>($vmid/snap-$name-$snapname), as this is similar to how we encode
> >>snapshots often on the storage level (volume@snap). we also need to
> >>have some delimiter between snapshot and volume name that is not
> >>allowed in either (hard for volname since basically everything but
> >>'/' goes, but snapshots have a restricted character set (configid,
> >>which means alphanumeric, hyphen and underscore), so we could use
> >>something like '.' as delimiter? or we switch to directories and do
> >>$vmid/snap/$snap/$name?)
>
> Personally, I prefer a '.' delimiter over a subdirectory (better to
> have the info in the filename itself)
we can still bikeshed this later once other issues are ironed out, I just wanted to note it - it's very unfortunate that volume names are currently so unrestricted, but we have to live with it for now. I just realized that even snap-XXX.qcow2 would be recognized/returned as an image, so we need something else or filter them out anyway.
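just to make the '.' idea concrete: because snapshot names are configid-restricted, the last '.'-separated token can be split off unambiguously even though the volume name itself may contain dots (a sketch only, the actual naming scheme is still open):

```python
import re

# hypothetical encoding: "<vmid>/snap-<name>.<snapname>"; the snapshot
# name is a configid (alphanumeric, hyphen, underscore - never '.'),
# so a greedy match on <name> leaves exactly the last token as <snapname>
SNAP_RE = re.compile(r"^(\d+)/snap-(.+)\.([A-Za-z0-9_][A-Za-z0-9_-]*)$")

def snap_volname(vmid, name, snap):
    return f"{vmid}/snap-{name}.{snap}"

def parse_snap_volname(volname):
    m = SNAP_RE.match(volname)
    if not m:
        return None
    vmid, name, snap = m.groups()
    return int(vmid), name, snap
```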
_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel