Date: Wed, 2 Apr 2025 10:10:33 +0200 (CEST)
From: Fabian Grünbichler <f.gruenbichler@proxmox.com>
To: Proxmox VE development discussion <pve-devel@lists.proxmox.com>
Message-ID: <476324959.4386.1743581433778@webmail.proxmox.com>
In-Reply-To: <mailman.965.1741689000.293.pve-devel@lists.proxmox.com>
References: <20250311102905.2680524-1-alexandre.derumier@groupe-cyllene.com> <mailman.965.1741689000.293.pve-devel@lists.proxmox.com>
Subject: Re: [pve-devel] [PATCH v4 qemu-server 11/11] qcow2: add external snapshot support

commit description missing here as well..
I haven't tested this (or the first patches doing the blockdev conversion) yet, but I see a few bigger design/architecture issues left (besides FIXMEs for missing pieces that previously worked ;)):

- we should probably move the decision whether a snapshot is done on the storage layer or by qemu into the control of the storage plugin, especially since we are currently cleaning that API up to allow easier implementation of external plugins
- if we do that, we can also make "uses external qcow2 snapshots" a property of the storage plugin+config to replace the hard-coded checks for the snapext property or lvm+qcow2
- there are a few operations here that should not call directly into the storage plugin code or do equivalent actions, but should rather get a proper interface in the storage plugin API

the first one is the renaming of a blockdev while it is in use, which is currently done like this:
-- "link" the snapshot path to make it available under the old and the new name
-- handle blockdev additions/reopening/backing-file updates/deletions on the qemu layer
-- remove the old snapshot path link
-- if LVM, rename the actual volume (for non-LVM, linking followed by unlinking the source is effectively a rename already)

I wonder whether that couldn't be made more straightforward by doing
-- rename the snapshot volume/image (qemu must already have the old name open anyway and should be able to continue using it)
-- do blockdev additions/reopening/backing-file updates/deletions on the qemu layer

or is there an issue/check in qemu somewhere that prevents this approach? if not, we could just introduce a "volume_snapshot_rename" or extend rename_volume with a snapshot parameter..

the second thing that happens is deleting a snapshot volume/path without deleting the whole snapshot..
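as a quick sanity check of the claim above that, for file-backed volumes, linking the new name and then unlinking the old one is effectively a rename — here is a tiny illustrative simulation with plain files (Python, hypothetical file names, not the actual plugin code):

```python
import os
import tempfile

# Compare the current "link + unlink" flow with a plain rename(2)
# on ordinary files and check that they leave the same end state.
with tempfile.TemporaryDirectory() as d:
    src_a = os.path.join(d, "vm-disk-snap-old.qcow2")
    dst_a = os.path.join(d, "vm-disk-snap-new.qcow2")
    src_b = os.path.join(d, "vm-disk2-snap-old.qcow2")
    dst_b = os.path.join(d, "vm-disk2-snap-new.qcow2")
    for p in (src_a, src_b):
        with open(p, "w") as f:
            f.write("fake image payload")

    # current flow: hardlink under the new name, then drop the old name
    os.link(src_a, dst_a)
    os.unlink(src_a)

    # proposed flow: a single atomic rename
    os.rename(src_b, dst_b)

    # both flows leave only the new name behind, content unchanged
    names = sorted(os.listdir(d))

print(names)
```

in both cases the inode stays the same, which is what matters for qemu: a file descriptor that is already open keeps working across either sequence.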
that second operation we could easily support by extending volume_snapshot_delete, either overloading the $running parameter (e.g., passing "2") or adding a new one, to signify that all the housekeeping was already done and just the actual snapshot volume should be deleted. this shouldn't be an issue provided all such calls are guarded by first checking that we are using external snapshots..

> Alexandre Derumier via pve-devel <pve-devel@lists.proxmox.com> wrote on 11.03.2025 11:29 CET:
> 
> Signed-off-by: Alexandre Derumier <alexandre.derumier@groupe-cyllene.com>
> ---
>  PVE/QemuConfig.pm       |   4 +-
>  PVE/QemuServer.pm       | 226 +++++++++++++++++++++++++++++++++++++---
>  PVE/QemuServer/Drive.pm |   4 +
>  3 files changed, 220 insertions(+), 14 deletions(-)
> 
> diff --git a/PVE/QemuConfig.pm b/PVE/QemuConfig.pm
> index b60cc398..2b3acb15 100644
> --- a/PVE/QemuConfig.pm
> +++ b/PVE/QemuConfig.pm
> @@ -377,7 +377,7 @@ sub __snapshot_create_vol_snapshot {
>  
>      print "snapshotting '$device' ($drive->{file})\n";
>  
> -    PVE::QemuServer::qemu_volume_snapshot($vmid, $device, $storecfg, $volid, $snapname);
> +    PVE::QemuServer::qemu_volume_snapshot($vmid, $device, $storecfg, $drive, $snapname);
>  }
>  
>  sub __snapshot_delete_remove_drive {
> @@ -414,7 +414,7 @@ sub __snapshot_delete_vol_snapshot {
>      my $storecfg = PVE::Storage::config();
>      my $volid = $drive->{file};
>  
> -    PVE::QemuServer::qemu_volume_snapshot_delete($vmid, $storecfg, $volid, $snapname);
> +    PVE::QemuServer::qemu_volume_snapshot_delete($vmid, $storecfg, $drive, $snapname);
>  
>      push @$unused, $volid;
>  }
> diff --git a/PVE/QemuServer.pm b/PVE/QemuServer.pm
> index 60481acc..6ce3e9c6 100644
> --- a/PVE/QemuServer.pm
> +++ b/PVE/QemuServer.pm
> @@ -4449,20 +4449,200 @@ sub qemu_block_resize {
>  }
>  
>  sub qemu_volume_snapshot {
> -    my ($vmid, $deviceid, $storecfg, $volid, $snap) = @_;
> +    my ($vmid, $deviceid, $storecfg, $drive, $snap) = @_;
>  
> +    my $volid = $drive->{file};
>      my $running = check_running($vmid);
> -
> -    if ($running && do_snapshots_with_qemu($storecfg, $volid, $deviceid)) {
> -        mon_cmd($vmid, 'blockdev-snapshot-internal-sync', device => $deviceid, name => $snap);
> +    my $do_snapshots_with_qemu = do_snapshots_with_qemu($storecfg, $volid, $deviceid) if $running;

forbidden syntax (a `my` declaration with a statement-modifier `if` has undefined behaviour in perl)

> +    if ($do_snapshots_with_qemu) {
> +        if($do_snapshots_with_qemu == 2) {
> +            my $snapshots = PVE::Storage::volume_snapshot_info($storecfg, $volid);
> +            my $parent_snap = $snapshots->{'current'}->{parent};
> +            my $size = PVE::Storage::volume_size_info($storecfg, $volid, 5);
> +            blockdev_rename($storecfg, $vmid, $deviceid, $drive, 'current', $snap, $parent_snap);
> +            blockdev_external_snapshot($storecfg, $vmid, $deviceid, $drive, $snap, $size);
> +        } else {
> +            mon_cmd($vmid, 'blockdev-snapshot-internal-sync', device => $deviceid, name => $snap);
> +        }
>      } else {
>          PVE::Storage::volume_snapshot($storecfg, $volid, $snap);
>      }
>  }
>  
> +sub blockdev_external_snapshot {
> +    my ($storecfg, $vmid, $deviceid, $drive, $snap, $size) = @_;
> +
> +    my $volid = $drive->{file};
> +
> +    #be sure to add drive in write mode
> +    delete($drive->{ro});

why?

> +
> +    my $new_file_blockdev = generate_file_blockdev($storecfg, $drive);
> +    my $new_fmt_blockdev = generate_format_blockdev($storecfg, $drive, $new_file_blockdev);
> +
> +    my $snap_file_blockdev = generate_file_blockdev($storecfg, $drive, $snap);
> +    my $snap_fmt_blockdev = generate_format_blockdev($storecfg, $drive, $snap_file_blockdev, $snap);
> +
> +    #preallocate add a new current file with reference to backing-file
> +    my ($storeid, $volname) = PVE::Storage::parse_volume_id($volid);
> +    my $name = (PVE::Storage::parse_volname($storecfg, $volid))[1];
> +    PVE::Storage::vdisk_alloc($storecfg, $storeid, $vmid, 'qcow2', $name, $size/1024, $snap_file_blockdev->(unknown));

if we instead extend volume_snapshot similarly to what I describe up top (adding a parameter that renaming was already done), we don't need to extend vdisk_alloc's interface like this..
or maybe we could even combine blockdev_rename and blockdev_external_snapshot, to just call PVE::Storage::volume_snapshot to do rename+alloc, and then do the blockdev dance? in any case, this here would be the *only* external caller of vdisk_alloc with a backing file, so I don't think this is the right interface..

> +
> +    #backing need to be forced to undef in blockdev, to avoid reopen of backing-file on blockdev-add
> +    $new_fmt_blockdev->{backing} = undef;
> +
> +    PVE::QemuServer::Monitor::mon_cmd($vmid, 'blockdev-add', %$new_fmt_blockdev);
> +
> +    mon_cmd($vmid, 'blockdev-snapshot', node => $snap_fmt_blockdev->{'node-name'}, overlay => $new_fmt_blockdev->{'node-name'});
> +}
> +
> +sub blockdev_delete {
> +    my ($storecfg, $vmid, $drive, $file_blockdev, $fmt_blockdev) = @_;
> +
> +    #add eval as reopen is auto removing the old nodename automatically only if it was created at vm start in command line argument
> +    eval { mon_cmd($vmid, 'blockdev-del', 'node-name' => $file_blockdev->{'node-name'}) };
> +    eval { mon_cmd($vmid, 'blockdev-del', 'node-name' => $fmt_blockdev->{'node-name'}) };
> +
> +    #delete the file (don't use vdisk_free as we don't want to delete all snapshot chain)
> +    print"delete old $file_blockdev->(unknown)\n";
> +
> +    my $storage_name = PVE::Storage::parse_volume_id($drive->{file});
> +    my $scfg = $storecfg->{ids}->{$storage_name};
> +    if ($scfg->{type} eq 'lvm') {
> +        PVE::Storage::LVMPlugin::lvremove($file_blockdev->(unknown));
> +    } else {
> +        unlink($file_blockdev->(unknown));
> +    }

this really needs to be handled in the storage layer

> +}
> +
> +sub blockdev_rename {
> +    my ($storecfg, $vmid, $deviceid, $drive, $src_snap, $target_snap, $parent_snap) = @_;
> +
> +    print "rename $src_snap to $target_snap\n";
> +
> +    my $volid = $drive->{file};
> +
> +    my $src_file_blockdev = generate_file_blockdev($storecfg, $drive, $src_snap);
> +    my $src_fmt_blockdev = generate_format_blockdev($storecfg, $drive, $src_file_blockdev, $src_snap);
> +    my $target_file_blockdev = generate_file_blockdev($storecfg, $drive, $target_snap);
> +    my $target_fmt_blockdev = generate_format_blockdev($storecfg, $drive, $target_file_blockdev, $target_snap);
> +
> +    #create a hardlink
> +    link($src_file_blockdev->(unknown), $target_file_blockdev->(unknown));

this really needs to be handled in the storage layer

> +
> +    if($target_snap eq 'current' || $src_snap eq 'current') {
> +        #rename from|to current
> +
> +        #add backing to target
> +        if ($parent_snap) {
> +            my $parent_fmt_nodename = encode_nodename('fmt', $volid, $parent_snap);
> +            $target_fmt_blockdev->{backing} = $parent_fmt_nodename;
> +        }
> +        PVE::QemuServer::Monitor::mon_cmd($vmid, 'blockdev-add', %$target_fmt_blockdev);
> +
> +        #reopen the current throttlefilter nodename with the target fmt nodename
> +        my $drive_blockdev = generate_drive_blockdev($storecfg, $vmid, $drive);
> +        delete $drive_blockdev->{file};
> +        $drive_blockdev->{file} = $target_fmt_blockdev->{'node-name'};

these two lines can be a single line

> +        PVE::QemuServer::Monitor::mon_cmd($vmid, 'blockdev-reopen', options => [$drive_blockdev]);
> +    } else {
> +        #intermediate snapshot
> +        PVE::QemuServer::Monitor::mon_cmd($vmid, 'blockdev-add', %$target_fmt_blockdev);
> +
> +        #reopen the parent node with the new target fmt backing node
> +        my $parent_file_blockdev = generate_file_blockdev($storecfg, $drive, $parent_snap);
> +        my $parent_fmt_blockdev = generate_format_blockdev($storecfg, $drive, $parent_file_blockdev, $parent_snap);
> +        $parent_fmt_blockdev->{backing} = $target_fmt_blockdev->{'node-name'};
> +        PVE::QemuServer::Monitor::mon_cmd($vmid, 'blockdev-reopen', options => [$parent_fmt_blockdev]);
> +
> +        #change backing-file in qcow2 metadatas
> +        PVE::QemuServer::Monitor::mon_cmd($vmid, 'change-backing-file', device => $deviceid, 'image-node-name' => $parent_fmt_blockdev->{'node-name'}, 'backing-file' => $target_file_blockdev->(unknown));
> +    }
> +
> +    # delete old file|fmt nodes
> +    # add eval as reopen is auto removing the old nodename automatically only if it was created at vm start in command line argument

ugh..

> +    eval { PVE::QemuServer::Monitor::mon_cmd($vmid, 'blockdev-del', 'node-name' => $src_file_blockdev->{'node-name'})};
> +    eval { PVE::QemuServer::Monitor::mon_cmd($vmid, 'blockdev-del', 'node-name' => $src_fmt_blockdev->{'node-name'})};
> +
> +    unlink($src_file_blockdev->(unknown));

same as above

> +
> +    #rename underlay
> +    my $storage_name = PVE::Storage::parse_volume_id($volid);
> +    my $scfg = $storecfg->{ids}->{$storage_name};
> +    return if $scfg->{type} ne 'lvm';
> +
> +    print "rename underlay lvm volume $src_file_blockdev->(unknown) to $target_file_blockdev->(unknown)\n";
> +    PVE::Storage::LVMPlugin::lvrename(undef, $src_file_blockdev->(unknown), $target_file_blockdev->(unknown));

absolute no-go, this needs to be handled in the storage layer

> +}
> +
> +sub blockdev_commit {
> +    my ($storecfg, $vmid, $deviceid, $drive, $src_snap, $target_snap) = @_;
> +
> +    my $volid = $drive->{file};
> +
> +    print "block-commit $src_snap to base:$target_snap\n";
> +    $src_snap = undef if $src_snap && $src_snap eq 'current';
> +
> +    my $target_file_blockdev = generate_file_blockdev($storecfg, $drive, $target_snap);
> +    my $target_fmt_blockdev = generate_format_blockdev($storecfg, $drive, $target_file_blockdev, $target_snap);
> +
> +    my $src_file_blockdev = generate_file_blockdev($storecfg, $drive, $src_snap);
> +    my $src_fmt_blockdev = generate_format_blockdev($storecfg, $drive, $src_file_blockdev, $src_snap);
> +
> +    my $job_id = "commit-$deviceid";
> +    my $jobs = {};
> +    my $opts = { 'job-id' => $job_id, device => $deviceid };
> +
> +    my $complete = undef;
> +    if ($src_snap) {
> +        $complete = 'auto';
> +        $opts->{'top-node'} = $src_fmt_blockdev->{'node-name'};
> +        $opts->{'base-node'} = $target_fmt_blockdev->{'node-name'};
> +    } else {
> +        $complete = 'complete';
> +        $opts->{'base-node'} = $target_fmt_blockdev->{'node-name'};
> +        $opts->{replaces} = $src_fmt_blockdev->{'node-name'};
> +    }
> +
> +    mon_cmd($vmid, "block-commit", %$opts);
> +    $jobs->{$job_id} = {};
> +    qemu_drive_mirror_monitor ($vmid, undef, $jobs, $complete, 0, 'commit');
> +
> +    blockdev_delete($storecfg, $vmid, $drive, $src_file_blockdev, $src_fmt_blockdev);
> +}
> +
> +sub blockdev_stream {
> +    my ($storecfg, $vmid, $deviceid, $drive, $snap, $parent_snap, $target_snap) = @_;
> +
> +    my $volid = $drive->{file};
> +    $target_snap = undef if $target_snap eq 'current';
> +
> +    my $parent_file_blockdev = generate_file_blockdev($storecfg, $drive, $parent_snap);
> +    my $parent_fmt_blockdev = generate_format_blockdev($storecfg, $drive, $parent_file_blockdev, $parent_snap);
> +
> +    my $target_file_blockdev = generate_file_blockdev($storecfg, $drive, $target_snap);
> +    my $target_fmt_blockdev = generate_format_blockdev($storecfg, $drive, $target_file_blockdev, $target_snap);
> +
> +    my $snap_file_blockdev = generate_file_blockdev($storecfg, $drive, $snap);
> +    my $snap_fmt_blockdev = generate_format_blockdev($storecfg, $drive, $snap_file_blockdev, $snap);
> +
> +    my $job_id = "stream-$deviceid";
> +    my $jobs = {};
> +    my $options = { 'job-id' => $job_id, device => $target_fmt_blockdev->{'node-name'} };
> +    $options->{'base-node'} = $parent_fmt_blockdev->{'node-name'};
> +    $options->{'backing-file'} = $parent_file_blockdev->(unknown);
> +
> +    mon_cmd($vmid, 'block-stream', %$options);
> +    $jobs->{$job_id} = {};
> +    qemu_drive_mirror_monitor($vmid, undef, $jobs, 'auto', 0, 'stream');
> +
> +    blockdev_delete($storecfg, $vmid, $drive, $snap_file_blockdev, $snap_fmt_blockdev);
> +}
> +
>  sub qemu_volume_snapshot_delete {
> -    my ($vmid, $storecfg, $volid, $snap) = @_;
> +    my ($vmid, $storecfg, $drive, $snap) = @_;
>  
> +    my $volid = $drive->{file};
>      my $running = check_running($vmid);
>      my $attached_deviceid;
>  
> @@ -4474,13 +4654,35 @@ sub qemu_volume_snapshot_delete {
>          });
>      }
>  
> -    if ($attached_deviceid && do_snapshots_with_qemu($storecfg, $volid, $attached_deviceid)) {
> -        mon_cmd(
> -            $vmid,
> -            'blockdev-snapshot-delete-internal-sync',
> -            device => $attached_deviceid,
> -            name => $snap,
> -        );
> +    my $do_snapshots_with_qemu = do_snapshots_with_qemu($storecfg, $volid, $attached_deviceid) if $running;
> +    if ($attached_deviceid && $do_snapshots_with_qemu) {
> +
> +        if ($do_snapshots_with_qemu == 2) {
> +
> +            my $path = PVE::Storage::path($storecfg, $volid);
> +            my $snapshots = PVE::Storage::volume_snapshot_info($storecfg, $volid);
> +            my $parentsnap = $snapshots->{$snap}->{parent};
> +            my $childsnap = $snapshots->{$snap}->{child};
> +
> +            # if we delete the first snasphot, we commit because the first snapshot original base image, it should be big.
> +            # improve-me: if firstsnap > child : commit, if firstsnap < child do a stream.
> +            if(!$parentsnap) {
> +                print"delete first snapshot $snap\n";
> +                blockdev_commit($storecfg, $vmid, $attached_deviceid, $drive, $childsnap, $snap);
> +                blockdev_rename($storecfg, $vmid, $attached_deviceid, $drive, $snap, $childsnap, $snapshots->{$childsnap}->{child});
> +            } else {
> +                #intermediate snapshot, we always stream the snapshot to child snapshot
> +                print"stream intermediate snapshot $snap to $childsnap\n";
> +                blockdev_stream($storecfg, $vmid, $attached_deviceid, $drive, $snap, $parentsnap, $childsnap);
> +            }
> +        } else {
> +            mon_cmd(
> +                $vmid,
> +                'blockdev-snapshot-delete-internal-sync',
> +                device => $attached_deviceid,
> +                name => $snap,
> +            );
> +        }
>      } else {
>          PVE::Storage::volume_snapshot_delete(
>              $storecfg, $volid, $snap, $attached_deviceid ? 1 : undef);
> diff --git a/PVE/QemuServer/Drive.pm b/PVE/QemuServer/Drive.pm
> index 51513546..7ba401bd 100644
> --- a/PVE/QemuServer/Drive.pm
> +++ b/PVE/QemuServer/Drive.pm
> @@ -1117,6 +1117,8 @@ sub print_drive_throttle_group {
>  sub generate_file_blockdev {
>      my ($storecfg, $drive, $snap, $nodename) = @_;
>  
> +    $snap = undef if $snap && $snap eq 'current';
> +
>      my $volid = $drive->{file};
>      my $blockdev = {};
>  
> @@ -1260,6 +1262,8 @@ sub do_snapshots_with_qemu {
>  sub generate_format_blockdev {
>      my ($storecfg, $drive, $file, $snap, $nodename) = @_;
>  
> +    $snap = undef if $snap && $snap eq 'current';
> +
>      my $volid = $drive->{file};
>      die "format_blockdev can't be used for nbd" if $volid =~ /^nbd:/;
>  
> -- 
> 2.39.5

_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel