public inbox for pve-devel@lists.proxmox.com
* [pve-devel] [PATCH-SERIES qemu-server] clone_disk-related improvements
@ 2022-01-27 14:01 Fabian Ebner
  2022-01-27 14:01 ` [pve-devel] [PATCH qemu-server 1/4] drive mirror monitor: warn when suspend/resume/freeze/thaw calls fail Fabian Ebner
                   ` (4 more replies)
  0 siblings, 5 replies; 9+ messages in thread
From: Fabian Ebner @ 2022-01-27 14:01 UTC (permalink / raw)
  To: pve-devel

The second patch fixes a concrete bug where vm_{resume,suspend} calls were
not issued for a full clone of a running VM without a guest agent. The other
patches are minor improvements, mostly aimed at preventing future bugs.

Fabian Ebner (4):
  drive mirror monitor: warn when suspend/resume/freeze/thaw calls fail
  api: clone: fork before locking
  api: move disk: fork before locking
  clone disk: don't modify drive parameter

 PVE/API2/Qemu.pm  | 382 ++++++++++++++++++++++++----------------------
 PVE/QemuServer.pm |  10 +-
 2 files changed, 205 insertions(+), 187 deletions(-)

-- 
2.30.2

* [pve-devel] [PATCH qemu-server 1/4] drive mirror monitor: warn when suspend/resume/freeze/thaw calls fail
  2022-01-27 14:01 [pve-devel] [PATCH-SERIES qemu-server] clone_disk-related improvements Fabian Ebner
@ 2022-01-27 14:01 ` Fabian Ebner
  2022-01-27 14:01 ` [pve-devel] [PATCH qemu-server 2/4] api: clone: fork before locking Fabian Ebner
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 9+ messages in thread
From: Fabian Ebner @ 2022-01-27 14:01 UTC (permalink / raw)
  To: pve-devel

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
---
 PVE/QemuServer.pm | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/PVE/QemuServer.pm b/PVE/QemuServer.pm
index 0071a06..4f531f2 100644
--- a/PVE/QemuServer.pm
+++ b/PVE/QemuServer.pm
@@ -7466,9 +7466,11 @@ sub qemu_drive_mirror_monitor {
 		    if ($agent_running) {
 			print "freeze filesystem\n";
 			eval { mon_cmd($vmid, "guest-fsfreeze-freeze"); };
+			warn $@ if $@;
 		    } else {
 			print "suspend vm\n";
 			eval { PVE::QemuServer::vm_suspend($vmid, 1); };
+			warn $@ if $@;
 		    }
 
 		    # if we clone a disk for a new target vm, we don't switch the disk
@@ -7477,9 +7479,11 @@ sub qemu_drive_mirror_monitor {
 		    if ($agent_running) {
 			print "unfreeze filesystem\n";
 			eval { mon_cmd($vmid, "guest-fsfreeze-thaw"); };
+			warn $@ if $@;
 		    } else {
 			print "resume vm\n";
-			eval {  PVE::QemuServer::vm_resume($vmid, 1, 1); };
+			eval { PVE::QemuServer::vm_resume($vmid, 1, 1); };
+			warn $@ if $@;
 		    }
 
 		    last;
-- 
2.30.2

* [pve-devel] [PATCH qemu-server 2/4] api: clone: fork before locking
  2022-01-27 14:01 [pve-devel] [PATCH-SERIES qemu-server] clone_disk-related improvements Fabian Ebner
  2022-01-27 14:01 ` [pve-devel] [PATCH qemu-server 1/4] drive mirror monitor: warn when suspend/resume/freeze/thaw calls fail Fabian Ebner
@ 2022-01-27 14:01 ` Fabian Ebner
  2022-01-31 12:34   ` Fabian Grünbichler
  2022-01-27 14:01 ` [pve-devel] [PATCH qemu-server 3/4] api: move disk: " Fabian Ebner
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 9+ messages in thread
From: Fabian Ebner @ 2022-01-27 14:01 UTC (permalink / raw)
  To: pve-devel

using the familiar early+repeated checks pattern from other API calls.
The only intended functional changes are with regard to locking/forking.

For a full clone of a running VM without a guest agent, this also fixes
the vm_{resume,suspend} calls issued for drive mirror completion.
Previously, those just timed out because they could not get the lock:

> create full clone of drive scsi0 (rbdkvm:vm-104-disk-0)
> Formatting '/var/lib/vz/images/105/vm-105-disk-0.raw', fmt=raw
> size=4294967296 preallocation=off
> drive mirror is starting for drive-scsi0
> drive-scsi0: transferred 2.0 MiB of 4.0 GiB (0.05%) in 0s
> drive-scsi0: transferred 635.0 MiB of 4.0 GiB (15.50%) in 1s
> drive-scsi0: transferred 1.6 GiB of 4.0 GiB (40.50%) in 2s
> drive-scsi0: transferred 3.6 GiB of 4.0 GiB (90.23%) in 3s
> drive-scsi0: transferred 4.0 GiB of 4.0 GiB (100.00%) in 4s, ready
> all 'mirror' jobs are ready
> suspend vm
> trying to acquire lock...
> can't lock file '/var/lock/qemu-server/lock-104.conf' - got timeout
> drive-scsi0: Cancelling block job
> drive-scsi0: Done.
> resume vm
> trying to acquire lock...
> can't lock file '/var/lock/qemu-server/lock-104.conf' - got timeout

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
---

Best viewed with -w.
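
As extra context for reviewers, here is a minimal, self-contained sketch of the
early+repeated-checks flow this patch moves to. All names below (with_lock,
load_and_check, the lock file path) are made-up stand-ins for illustration; the
real code uses PVE::QemuConfig->lock_config_full()/lock_config_shared() and
$rpcenv->fork_worker() as shown in the diff.

#!/usr/bin/perl
use strict;
use warnings;
use Fcntl qw(:flock);

my $lockfile = '/tmp/demo-vm-100.lock'; # stand-in for the per-VM config lock

sub load_and_check {
    # load the config and run all parameter/permission checks, dying early on error
    die "snapshot 'foo' does not exist\n" if 0; # placeholder check
    return { name => 'demo' }; # pretend this is the loaded config
}

sub with_lock {
    my ($code) = @_;
    open(my $fh, '>>', $lockfile) or die "cannot open lock file: $!\n";
    flock($fh, LOCK_EX) or die "cannot lock '$lockfile': $!\n";
    my $res = eval { $code->() };
    my $err = $@;
    close($fh);
    die $err if $err;
    return $res;
}

# 1) early checks: fail fast, before any worker task or lock exists
load_and_check();

# 2) fork the long-running worker; the API call itself returns right away
my $pid = fork() // die "fork failed: $!\n";
if ($pid == 0) {
    # 3) inside the worker: take the lock, then REPEAT the checks, because the
    #    configuration may have changed between the early check and locking
    with_lock(sub {
        my $conf = load_and_check();
        # ... the actual clone work on $conf would happen here ...
    });
    exit(0);
}
waitpid($pid, 0);

The early call only makes failures cheap and visible before a task shows up in
the task list; correctness still relies on the repeated call made under the lock.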

 PVE/API2/Qemu.pm | 220 ++++++++++++++++++++++++-----------------------
 1 file changed, 112 insertions(+), 108 deletions(-)

diff --git a/PVE/API2/Qemu.pm b/PVE/API2/Qemu.pm
index 6992f6f..38e08c8 100644
--- a/PVE/API2/Qemu.pm
+++ b/PVE/API2/Qemu.pm
@@ -3079,9 +3079,7 @@ __PACKAGE__->register_method({
 
 	my $running = PVE::QemuServer::check_running($vmid) || 0;
 
-	my $clonefn = sub {
-	    # do all tests after lock but before forking worker - if possible
-
+	my $load_and_check = sub {
 	    my $conf = PVE::QemuConfig->load_config($vmid);
 	    PVE::QemuConfig->check_lock($conf);
 
@@ -3091,7 +3089,7 @@ __PACKAGE__->register_method({
 	    die "snapshot '$snapname' does not exist\n"
 		if $snapname && !defined( $conf->{snapshots}->{$snapname});
 
-	    my $full = extract_param($param, 'full') // !PVE::QemuConfig->is_template($conf);
+	    my $full = $param->{full} // !PVE::QemuConfig->is_template($conf);
 
 	    die "parameter 'storage' not allowed for linked clones\n"
 		if defined($storage) && !$full;
@@ -3156,7 +3154,13 @@ __PACKAGE__->register_method({
 		}
 	    }
 
-            # auto generate a new uuid
+	    return ($conffile, $newconf, $oldconf, $vollist, $drives, $fullclone);
+	};
+
+	my $clonefn = sub {
+	    my ($conffile, $newconf, $oldconf, $vollist, $drives, $fullclone) = $load_and_check->();
+
+	    # auto generate a new uuid
 	    my $smbios1 = PVE::QemuServer::parse_smbios1($newconf->{smbios1} || '');
 	    $smbios1->{uuid} = PVE::QemuServer::generate_uuid();
 	    $newconf->{smbios1} = PVE::QemuServer::print_smbios1($smbios1);
@@ -3181,105 +3185,99 @@ __PACKAGE__->register_method({
 	    # FIXME use PVE::QemuConfig->create_and_lock_config and adapt code
 	    PVE::Tools::file_set_contents($conffile, "# qmclone temporary file\nlock: clone\n");
 
-	    my $realcmd = sub {
-		my $upid = shift;
-
-		my $newvollist = [];
-		my $jobs = {};
-
-		eval {
-		    local $SIG{INT} =
-			local $SIG{TERM} =
-			local $SIG{QUIT} =
-			local $SIG{HUP} = sub { die "interrupted by signal\n"; };
-
-		    PVE::Storage::activate_volumes($storecfg, $vollist, $snapname);
-
-		    my $bwlimit = extract_param($param, 'bwlimit');
-
-		    my $total_jobs = scalar(keys %{$drives});
-		    my $i = 1;
-
-		    foreach my $opt (sort keys %$drives) {
-			my $drive = $drives->{$opt};
-			my $skipcomplete = ($total_jobs != $i); # finish after last drive
-			my $completion = $skipcomplete ? 'skip' : 'complete';
-
-			my $src_sid = PVE::Storage::parse_volume_id($drive->{file});
-			my $storage_list = [ $src_sid ];
-			push @$storage_list, $storage if defined($storage);
-			my $clonelimit = PVE::Storage::get_bandwidth_limit('clone', $storage_list, $bwlimit);
-
-			my $newdrive = PVE::QemuServer::clone_disk(
-			    $storecfg,
-			    $vmid,
-			    $running,
-			    $opt,
-			    $drive,
-			    $snapname,
-			    $newid,
-			    $storage,
-			    $format,
-			    $fullclone->{$opt},
-			    $newvollist,
-			    $jobs,
-			    $completion,
-			    $oldconf->{agent},
-			    $clonelimit,
-			    $oldconf
-			);
-
-			$newconf->{$opt} = PVE::QemuServer::print_drive($newdrive);
-
-			PVE::QemuConfig->write_config($newid, $newconf);
-			$i++;
-		    }
-
-		    delete $newconf->{lock};
-
-		    # do not write pending changes
-		    if (my @changes = keys %{$newconf->{pending}}) {
-			my $pending = join(',', @changes);
-			warn "found pending changes for '$pending', discarding for clone\n";
-			delete $newconf->{pending};
-		    }
-
-		    PVE::QemuConfig->write_config($newid, $newconf);
-
-                    if ($target) {
-			# always deactivate volumes - avoid lvm LVs to be active on several nodes
-			PVE::Storage::deactivate_volumes($storecfg, $vollist, $snapname) if !$running;
-			PVE::Storage::deactivate_volumes($storecfg, $newvollist);
-
-			my $newconffile = PVE::QemuConfig->config_file($newid, $target);
-			die "Failed to move config to node '$target' - rename failed: $!\n"
-			    if !rename($conffile, $newconffile);
-		    }
-
-		    PVE::AccessControl::add_vm_to_pool($newid, $pool) if $pool;
-		};
-		if (my $err = $@) {
-		    eval { PVE::QemuServer::qemu_blockjobs_cancel($vmid, $jobs) };
-		    sleep 1; # some storage like rbd need to wait before release volume - really?
-
-		    foreach my $volid (@$newvollist) {
-			eval { PVE::Storage::vdisk_free($storecfg, $volid); };
-			warn $@ if $@;
-		    }
-
-		    PVE::Firewall::remove_vmfw_conf($newid);
-
-		    unlink $conffile; # avoid races -> last thing before die
-
-		    die "clone failed: $err";
-		}
-
-		return;
-	    };
-
 	    PVE::Firewall::clone_vmfw_conf($vmid, $newid);
 
-	    return $rpcenv->fork_worker('qmclone', $vmid, $authuser, $realcmd);
+	    my $newvollist = [];
+	    my $jobs = {};
+
+	    eval {
+		local $SIG{INT} =
+		    local $SIG{TERM} =
+		    local $SIG{QUIT} =
+		    local $SIG{HUP} = sub { die "interrupted by signal\n"; };
+
+		PVE::Storage::activate_volumes($storecfg, $vollist, $snapname);
+
+		my $bwlimit = extract_param($param, 'bwlimit');
+
+		my $total_jobs = scalar(keys %{$drives});
+		my $i = 1;
+
+		foreach my $opt (sort keys %$drives) {
+		    my $drive = $drives->{$opt};
+		    my $skipcomplete = ($total_jobs != $i); # finish after last drive
+		    my $completion = $skipcomplete ? 'skip' : 'complete';
+
+		    my $src_sid = PVE::Storage::parse_volume_id($drive->{file});
+		    my $storage_list = [ $src_sid ];
+		    push @$storage_list, $storage if defined($storage);
+		    my $clonelimit = PVE::Storage::get_bandwidth_limit('clone', $storage_list, $bwlimit);
+
+		    my $newdrive = PVE::QemuServer::clone_disk(
+			$storecfg,
+			$vmid,
+			$running,
+			$opt,
+			$drive,
+			$snapname,
+			$newid,
+			$storage,
+			$format,
+			$fullclone->{$opt},
+			$newvollist,
+			$jobs,
+			$completion,
+			$oldconf->{agent},
+			$clonelimit,
+			$oldconf
+		    );
+
+		    $newconf->{$opt} = PVE::QemuServer::print_drive($newdrive);
+
+		    PVE::QemuConfig->write_config($newid, $newconf);
+		    $i++;
+		}
+
+		delete $newconf->{lock};
+
+		# do not write pending changes
+		if (my @changes = keys %{$newconf->{pending}}) {
+		    my $pending = join(',', @changes);
+		    warn "found pending changes for '$pending', discarding for clone\n";
+		    delete $newconf->{pending};
+		}
+
+		PVE::QemuConfig->write_config($newid, $newconf);
+
+		if ($target) {
+		    # always deactivate volumes - avoid lvm LVs to be active on several nodes
+		    PVE::Storage::deactivate_volumes($storecfg, $vollist, $snapname) if !$running;
+		    PVE::Storage::deactivate_volumes($storecfg, $newvollist);
+
+		    my $newconffile = PVE::QemuConfig->config_file($newid, $target);
+		    die "Failed to move config to node '$target' - rename failed: $!\n"
+			if !rename($conffile, $newconffile);
+		}
+
+		PVE::AccessControl::add_vm_to_pool($newid, $pool) if $pool;
+	    };
+	    if (my $err = $@) {
+		eval { PVE::QemuServer::qemu_blockjobs_cancel($vmid, $jobs) };
+		sleep 1; # some storage like rbd need to wait before release volume - really?
+
+		foreach my $volid (@$newvollist) {
+		    eval { PVE::Storage::vdisk_free($storecfg, $volid); };
+		    warn $@ if $@;
+		}
+
+		PVE::Firewall::remove_vmfw_conf($newid);
+
+		unlink $conffile; # avoid races -> last thing before die
+
+		die "clone failed: $err";
+	    }
+
+	    return;
 	};
 
 	# Aquire exclusive lock lock for $newid
@@ -3287,12 +3285,18 @@ __PACKAGE__->register_method({
 	    return PVE::QemuConfig->lock_config_full($newid, 1, $clonefn);
 	};
 
-	# exclusive lock if VM is running - else shared lock is enough;
-	if ($running) {
-	    return PVE::QemuConfig->lock_config_full($vmid, 1, $lock_target_vm);
-	} else {
-	    return PVE::QemuConfig->lock_config_shared($vmid, 1, $lock_target_vm);
-	}
+	my $lock_source_vm = sub {
+	    # exclusive lock if VM is running - else shared lock is enough;
+	    if ($running) {
+		return PVE::QemuConfig->lock_config_full($vmid, 1, $lock_target_vm);
+	    } else {
+		return PVE::QemuConfig->lock_config_shared($vmid, 1, $lock_target_vm);
+	    }
+	};
+
+	$load_and_check->(); # early checks before forking/locking
+
+	return $rpcenv->fork_worker('qmclone', $vmid, $authuser, $lock_source_vm);
     }});
 
 __PACKAGE__->register_method({
-- 
2.30.2

* [pve-devel] [PATCH qemu-server 3/4] api: move disk: fork before locking
  2022-01-27 14:01 [pve-devel] [PATCH-SERIES qemu-server] clone_disk-related improvements Fabian Ebner
  2022-01-27 14:01 ` [pve-devel] [PATCH qemu-server 1/4] drive mirror monitor: warn when suspend/resume/freeze/thaw calls fail Fabian Ebner
  2022-01-27 14:01 ` [pve-devel] [PATCH qemu-server 2/4] api: clone: fork before locking Fabian Ebner
@ 2022-01-27 14:01 ` Fabian Ebner
  2022-01-27 14:01 ` [pve-devel] [PATCH qemu-server 4/4] clone disk: don't modify drive parameter Fabian Ebner
  2022-02-08  8:36 ` [pve-devel] applied-series: [PATCH-SERIES qemu-server] clone_disk-related improvements Fabian Grünbichler
  4 siblings, 0 replies; 9+ messages in thread
From: Fabian Ebner @ 2022-01-27 14:01 UTC (permalink / raw)
  To: pve-devel

using the familiar early+repeated checks pattern from other API calls.
The only intended functional changes are with regard to locking/forking.

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
---

Best viewed with -w

 PVE/API2/Qemu.pm | 162 +++++++++++++++++++++++++----------------------
 1 file changed, 86 insertions(+), 76 deletions(-)

diff --git a/PVE/API2/Qemu.pm b/PVE/API2/Qemu.pm
index 38e08c8..59e083e 100644
--- a/PVE/API2/Qemu.pm
+++ b/PVE/API2/Qemu.pm
@@ -3396,7 +3396,7 @@ __PACKAGE__->register_method({
 
 	my $storecfg = PVE::Storage::config();
 
-	my $move_updatefn =  sub {
+	my $load_and_check_move = sub {
 	    my $conf = PVE::QemuConfig->load_config($vmid);
 	    PVE::QemuConfig->check_lock($conf);
 
@@ -3416,8 +3416,8 @@ __PACKAGE__->register_method({
 		$oldfmt = $1;
 	    }
 
-	    die "you can't move to the same storage with same format\n" if $oldstoreid eq $storeid &&
-                (!$format || !$oldfmt || $oldfmt eq $format);
+	    die "you can't move to the same storage with same format\n"
+		if $oldstoreid eq $storeid && (!$format || !$oldfmt || $oldfmt eq $format);
 
 	    # this only checks snapshots because $disk is passed!
 	    my $snapshotted = PVE::QemuServer::Drive::is_volume_in_use(
@@ -3429,6 +3429,13 @@ __PACKAGE__->register_method({
 	    die "you can't move a disk with snapshots and delete the source\n"
 		if $snapshotted && $param->{delete};
 
+	    return ($conf, $drive, $oldstoreid, $snapshotted);
+	};
+
+	my $move_updatefn = sub {
+	    my ($conf, $drive, $oldstoreid, $snapshotted) = $load_and_check_move->();
+	    my $old_volid = $drive->{file};
+
 	    PVE::Cluster::log_msg(
 		'info',
 		$authuser,
@@ -3439,83 +3446,79 @@ __PACKAGE__->register_method({
 
 	    PVE::Storage::activate_volumes($storecfg, [ $drive->{file} ]);
 
-	    my $realcmd = sub {
-		my $newvollist = [];
+	    my $newvollist = [];
+
+	    eval {
+		local $SIG{INT} =
+		    local $SIG{TERM} =
+		    local $SIG{QUIT} =
+		    local $SIG{HUP} = sub { die "interrupted by signal\n"; };
+
+		warn "moving disk with snapshots, snapshots will not be moved!\n"
+		    if $snapshotted;
+
+		my $bwlimit = extract_param($param, 'bwlimit');
+		my $movelimit = PVE::Storage::get_bandwidth_limit(
+		    'move',
+		    [$oldstoreid, $storeid],
+		    $bwlimit
+		);
+
+		my $newdrive = PVE::QemuServer::clone_disk(
+		    $storecfg,
+		    $vmid,
+		    $running,
+		    $disk,
+		    $drive,
+		    undef,
+		    $vmid,
+		    $storeid,
+		    $format,
+		    1,
+		    $newvollist,
+		    undef,
+		    undef,
+		    undef,
+		    $movelimit,
+		    $conf,
+		);
+		$conf->{$disk} = PVE::QemuServer::print_drive($newdrive);
+
+		PVE::QemuConfig->add_unused_volume($conf, $old_volid) if !$param->{delete};
+
+		# convert moved disk to base if part of template
+		PVE::QemuServer::template_create($vmid, $conf, $disk)
+		    if PVE::QemuConfig->is_template($conf);
+
+		PVE::QemuConfig->write_config($vmid, $conf);
+
+		my $do_trim = PVE::QemuServer::get_qga_key($conf, 'fstrim_cloned_disks');
+		if ($running && $do_trim && PVE::QemuServer::qga_check_running($vmid)) {
+		    eval { mon_cmd($vmid, "guest-fstrim") };
+		}
 
 		eval {
-		    local $SIG{INT} =
-			local $SIG{TERM} =
-			local $SIG{QUIT} =
-			local $SIG{HUP} = sub { die "interrupted by signal\n"; };
-
-		    warn "moving disk with snapshots, snapshots will not be moved!\n"
-			if $snapshotted;
-
-		    my $bwlimit = extract_param($param, 'bwlimit');
-		    my $movelimit = PVE::Storage::get_bandwidth_limit(
-			'move',
-			[$oldstoreid, $storeid],
-			$bwlimit
-		    );
-
-		    my $newdrive = PVE::QemuServer::clone_disk(
-			$storecfg,
-			$vmid,
-			$running,
-			$disk,
-			$drive,
-			undef,
-			$vmid,
-			$storeid,
-			$format,
-			1,
-			$newvollist,
-			undef,
-			undef,
-			undef,
-			$movelimit,
-			$conf,
-		    );
-		    $conf->{$disk} = PVE::QemuServer::print_drive($newdrive);
-
-		    PVE::QemuConfig->add_unused_volume($conf, $old_volid) if !$param->{delete};
-
-		    # convert moved disk to base if part of template
-		    PVE::QemuServer::template_create($vmid, $conf, $disk)
-			if PVE::QemuConfig->is_template($conf);
-
-		    PVE::QemuConfig->write_config($vmid, $conf);
-
-		    my $do_trim = PVE::QemuServer::get_qga_key($conf, 'fstrim_cloned_disks');
-		    if ($running && $do_trim && PVE::QemuServer::qga_check_running($vmid)) {
-			eval { mon_cmd($vmid, "guest-fstrim") };
-		    }
-
-		    eval {
-			# try to deactivate volumes - avoid lvm LVs to be active on several nodes
-			PVE::Storage::deactivate_volumes($storecfg, [ $newdrive->{file} ])
-			    if !$running;
-		    };
-		    warn $@ if $@;
+		    # try to deactivate volumes - avoid lvm LVs to be active on several nodes
+		    PVE::Storage::deactivate_volumes($storecfg, [ $newdrive->{file} ])
+			if !$running;
 		};
-		if (my $err = $@) {
-		    foreach my $volid (@$newvollist) {
-			eval { PVE::Storage::vdisk_free($storecfg, $volid) };
-			warn $@ if $@;
-		    }
-		    die "storage migration failed: $err";
-                }
-
-		if ($param->{delete}) {
-		    eval {
-			PVE::Storage::deactivate_volumes($storecfg, [$old_volid]);
-			PVE::Storage::vdisk_free($storecfg, $old_volid);
-		    };
-		    warn $@ if $@;
-		}
+		warn $@ if $@;
 	    };
+	    if (my $err = $@) {
+		foreach my $volid (@$newvollist) {
+		    eval { PVE::Storage::vdisk_free($storecfg, $volid) };
+		    warn $@ if $@;
+		}
+		die "storage migration failed: $err";
+	    }
 
-            return $rpcenv->fork_worker('qmmove', $vmid, $authuser, $realcmd);
+	    if ($param->{delete}) {
+		eval {
+		    PVE::Storage::deactivate_volumes($storecfg, [$old_volid]);
+		    PVE::Storage::vdisk_free($storecfg, $old_volid);
+		};
+		warn $@ if $@;
+	    }
 	};
 
 	my $load_and_check_reassign_configs = sub {
@@ -3695,7 +3698,14 @@ __PACKAGE__->register_method({
 
 	    die "cannot move disk '$disk', only configured disks can be moved to another storage\n"
 		if $disk =~ m/^unused\d+$/;
-	    return PVE::QemuConfig->lock_config($vmid, $move_updatefn);
+
+	    $load_and_check_move->(); # early checks before forking/locking
+
+	    my $realcmd = sub {
+		PVE::QemuConfig->lock_config($vmid, $move_updatefn);
+	    };
+
+	    return $rpcenv->fork_worker('qmmove', $vmid, $authuser, $realcmd);
 	} else {
 	    my $msg = "both 'storage' and 'target-vmid' missing, either needs to be set";
 	    raise_param_exc({ 'target-vmid' => $msg, 'storage' => $msg });
-- 
2.30.2

* [pve-devel] [PATCH qemu-server 4/4] clone disk: don't modify drive parameter
  2022-01-27 14:01 [pve-devel] [PATCH-SERIES qemu-server] clone_disk-related improvements Fabian Ebner
                   ` (2 preceding siblings ...)
  2022-01-27 14:01 ` [pve-devel] [PATCH qemu-server 3/4] api: move disk: " Fabian Ebner
@ 2022-01-27 14:01 ` Fabian Ebner
  2022-02-08  8:36 ` [pve-devel] applied-series: [PATCH-SERIES qemu-server] clone_disk-related improvements Fabian Grünbichler
  4 siblings, 0 replies; 9+ messages in thread
From: Fabian Ebner @ 2022-01-27 14:01 UTC (permalink / raw)
  To: pve-devel

While existing callers do not use the parameter after the call,
modifying it is rather unexpected and could quickly lead to bugs.

Also avoid setting an undef value in the hash; use delete instead.

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
---
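To illustrate the point (not part of the patch): a small, self-contained Perl
example of why aliasing the passed-in hash is surprising and why delete is
preferable to storing undef. The drive values below are made up for the demo.

#!/usr/bin/perl
use strict;
use warnings;
use Storable qw(dclone);

my $drive = { file => 'local:100/vm-100-disk-0.raw', format => 'raw' };

# Plain assignment only copies the reference, so the "copy" aliases the
# caller's hash and any modification leaks back into the caller's data.
my $aliased = $drive;
$aliased->{format} = undef;
print "caller's format after aliasing: ",
    defined($drive->{format}) ? $drive->{format} : 'undef', "\n"; # undef

$drive->{format} = 'raw'; # restore for the second half of the demo

# dclone() gives the callee its own deep copy; the caller stays untouched.
my $disk = dclone($drive);
delete $disk->{format}; # the key is really gone, exists() is now false
$disk->{file} = 'other:101/vm-101-disk-0.raw';

print "caller's format after dclone: $drive->{format}\n"; # raw
print "copy still has a format key: ", exists($disk->{format}) ? 'yes' : 'no', "\n"; # no

# Storing undef instead of deleting would keep the key around, which can
# confuse later code that treats "exists but undef" differently from "unset".
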
 PVE/QemuServer.pm | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/PVE/QemuServer.pm b/PVE/QemuServer.pm
index 4f531f2..8aa1946 100644
--- a/PVE/QemuServer.pm
+++ b/PVE/QemuServer.pm
@@ -7642,8 +7642,8 @@ sub clone_disk {
 no_data_clone:
     my ($size) = eval { PVE::Storage::volume_size_info($storecfg, $newvolid, 10) };
 
-    my $disk = $drive;
-    $disk->{format} = undef;
+    my $disk = dclone($drive);
+    delete $disk->{format};
     $disk->{file} = $newvolid;
     $disk->{size} = $size if defined($size);
 
-- 
2.30.2

* Re: [pve-devel] [PATCH qemu-server 2/4] api: clone: fork before locking
  2022-01-27 14:01 ` [pve-devel] [PATCH qemu-server 2/4] api: clone: fork before locking Fabian Ebner
@ 2022-01-31 12:34   ` Fabian Grünbichler
  2022-02-03  9:31     ` Fabian Ebner
  0 siblings, 1 reply; 9+ messages in thread
From: Fabian Grünbichler @ 2022-01-31 12:34 UTC (permalink / raw)
  To: Proxmox VE development discussion

On January 27, 2022 3:01 pm, Fabian Ebner wrote:
> using the familiar early+repeated checks pattern from other API calls.
> The only intended functional changes are with regard to locking/forking.

two questions:
- the FW config cloning happens inside the worker now, while it was 
  previously before forking the worker (LGTM, but might be called out 
  explicitly if intentional ;))
- there are some checks at the start of the endpoint (checking 
  storage/target), which are not repeated after the fork+lock - while 
  unlikely, our view of storage.cfg could change in-between (lock guest 
  config -> cfs_update). should those be moved in the check sub (or into 
  the check_storage_access_clone helper)?

rest of the series LGTM

> 
> For a full clone of a running VM without a guest agent, this also fixes
> the vm_{resume,suspend} calls issued for drive mirror completion.
> Previously, those just timed out because they could not get the lock:
> 
>> create full clone of drive scsi0 (rbdkvm:vm-104-disk-0)
>> Formatting '/var/lib/vz/images/105/vm-105-disk-0.raw', fmt=raw
>> size=4294967296 preallocation=off
>> drive mirror is starting for drive-scsi0
>> drive-scsi0: transferred 2.0 MiB of 4.0 GiB (0.05%) in 0s
>> drive-scsi0: transferred 635.0 MiB of 4.0 GiB (15.50%) in 1s
>> drive-scsi0: transferred 1.6 GiB of 4.0 GiB (40.50%) in 2s
>> drive-scsi0: transferred 3.6 GiB of 4.0 GiB (90.23%) in 3s
>> drive-scsi0: transferred 4.0 GiB of 4.0 GiB (100.00%) in 4s, ready
>> all 'mirror' jobs are ready
>> suspend vm
>> trying to acquire lock...
>> can't lock file '/var/lock/qemu-server/lock-104.conf' - got timeout
>> drive-scsi0: Cancelling block job
>> drive-scsi0: Done.
>> resume vm
>> trying to acquire lock...
>> can't lock file '/var/lock/qemu-server/lock-104.conf' - got timeout
> 
> Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
> ---
> 
> Best viewed with -w.
> 
>  PVE/API2/Qemu.pm | 220 ++++++++++++++++++++++++-----------------------
>  1 file changed, 112 insertions(+), 108 deletions(-)
> 
> diff --git a/PVE/API2/Qemu.pm b/PVE/API2/Qemu.pm
> index 6992f6f..38e08c8 100644
> --- a/PVE/API2/Qemu.pm
> +++ b/PVE/API2/Qemu.pm
> @@ -3079,9 +3079,7 @@ __PACKAGE__->register_method({
>  
>  	my $running = PVE::QemuServer::check_running($vmid) || 0;
>  
> -	my $clonefn = sub {
> -	    # do all tests after lock but before forking worker - if possible
> -
> +	my $load_and_check = sub {
>  	    my $conf = PVE::QemuConfig->load_config($vmid);
>  	    PVE::QemuConfig->check_lock($conf);
>  
> @@ -3091,7 +3089,7 @@ __PACKAGE__->register_method({
>  	    die "snapshot '$snapname' does not exist\n"
>  		if $snapname && !defined( $conf->{snapshots}->{$snapname});
>  
> -	    my $full = extract_param($param, 'full') // !PVE::QemuConfig->is_template($conf);
> +	    my $full = $param->{full} // !PVE::QemuConfig->is_template($conf);
>  
>  	    die "parameter 'storage' not allowed for linked clones\n"
>  		if defined($storage) && !$full;
> @@ -3156,7 +3154,13 @@ __PACKAGE__->register_method({
>  		}
>  	    }
>  
> -            # auto generate a new uuid
> +	    return ($conffile, $newconf, $oldconf, $vollist, $drives, $fullclone);
> +	};
> +
> +	my $clonefn = sub {
> +	    my ($conffile, $newconf, $oldconf, $vollist, $drives, $fullclone) = $load_and_check->();
> +
> +	    # auto generate a new uuid
>  	    my $smbios1 = PVE::QemuServer::parse_smbios1($newconf->{smbios1} || '');
>  	    $smbios1->{uuid} = PVE::QemuServer::generate_uuid();
>  	    $newconf->{smbios1} = PVE::QemuServer::print_smbios1($smbios1);
> @@ -3181,105 +3185,99 @@ __PACKAGE__->register_method({
>  	    # FIXME use PVE::QemuConfig->create_and_lock_config and adapt code
>  	    PVE::Tools::file_set_contents($conffile, "# qmclone temporary file\nlock: clone\n");
>  
> -	    my $realcmd = sub {
> -		my $upid = shift;
> -
> -		my $newvollist = [];
> -		my $jobs = {};
> -
> -		eval {
> -		    local $SIG{INT} =
> -			local $SIG{TERM} =
> -			local $SIG{QUIT} =
> -			local $SIG{HUP} = sub { die "interrupted by signal\n"; };
> -
> -		    PVE::Storage::activate_volumes($storecfg, $vollist, $snapname);
> -
> -		    my $bwlimit = extract_param($param, 'bwlimit');
> -
> -		    my $total_jobs = scalar(keys %{$drives});
> -		    my $i = 1;
> -
> -		    foreach my $opt (sort keys %$drives) {
> -			my $drive = $drives->{$opt};
> -			my $skipcomplete = ($total_jobs != $i); # finish after last drive
> -			my $completion = $skipcomplete ? 'skip' : 'complete';
> -
> -			my $src_sid = PVE::Storage::parse_volume_id($drive->{file});
> -			my $storage_list = [ $src_sid ];
> -			push @$storage_list, $storage if defined($storage);
> -			my $clonelimit = PVE::Storage::get_bandwidth_limit('clone', $storage_list, $bwlimit);
> -
> -			my $newdrive = PVE::QemuServer::clone_disk(
> -			    $storecfg,
> -			    $vmid,
> -			    $running,
> -			    $opt,
> -			    $drive,
> -			    $snapname,
> -			    $newid,
> -			    $storage,
> -			    $format,
> -			    $fullclone->{$opt},
> -			    $newvollist,
> -			    $jobs,
> -			    $completion,
> -			    $oldconf->{agent},
> -			    $clonelimit,
> -			    $oldconf
> -			);
> -
> -			$newconf->{$opt} = PVE::QemuServer::print_drive($newdrive);
> -
> -			PVE::QemuConfig->write_config($newid, $newconf);
> -			$i++;
> -		    }
> -
> -		    delete $newconf->{lock};
> -
> -		    # do not write pending changes
> -		    if (my @changes = keys %{$newconf->{pending}}) {
> -			my $pending = join(',', @changes);
> -			warn "found pending changes for '$pending', discarding for clone\n";
> -			delete $newconf->{pending};
> -		    }
> -
> -		    PVE::QemuConfig->write_config($newid, $newconf);
> -
> -                    if ($target) {
> -			# always deactivate volumes - avoid lvm LVs to be active on several nodes
> -			PVE::Storage::deactivate_volumes($storecfg, $vollist, $snapname) if !$running;
> -			PVE::Storage::deactivate_volumes($storecfg, $newvollist);
> -
> -			my $newconffile = PVE::QemuConfig->config_file($newid, $target);
> -			die "Failed to move config to node '$target' - rename failed: $!\n"
> -			    if !rename($conffile, $newconffile);
> -		    }
> -
> -		    PVE::AccessControl::add_vm_to_pool($newid, $pool) if $pool;
> -		};
> -		if (my $err = $@) {
> -		    eval { PVE::QemuServer::qemu_blockjobs_cancel($vmid, $jobs) };
> -		    sleep 1; # some storage like rbd need to wait before release volume - really?
> -
> -		    foreach my $volid (@$newvollist) {
> -			eval { PVE::Storage::vdisk_free($storecfg, $volid); };
> -			warn $@ if $@;
> -		    }
> -
> -		    PVE::Firewall::remove_vmfw_conf($newid);
> -
> -		    unlink $conffile; # avoid races -> last thing before die
> -
> -		    die "clone failed: $err";
> -		}
> -
> -		return;
> -	    };
> -
>  	    PVE::Firewall::clone_vmfw_conf($vmid, $newid);
>  
> -	    return $rpcenv->fork_worker('qmclone', $vmid, $authuser, $realcmd);
> +	    my $newvollist = [];
> +	    my $jobs = {};
> +
> +	    eval {
> +		local $SIG{INT} =
> +		    local $SIG{TERM} =
> +		    local $SIG{QUIT} =
> +		    local $SIG{HUP} = sub { die "interrupted by signal\n"; };
> +
> +		PVE::Storage::activate_volumes($storecfg, $vollist, $snapname);
> +
> +		my $bwlimit = extract_param($param, 'bwlimit');
> +
> +		my $total_jobs = scalar(keys %{$drives});
> +		my $i = 1;
> +
> +		foreach my $opt (sort keys %$drives) {
> +		    my $drive = $drives->{$opt};
> +		    my $skipcomplete = ($total_jobs != $i); # finish after last drive
> +		    my $completion = $skipcomplete ? 'skip' : 'complete';
> +
> +		    my $src_sid = PVE::Storage::parse_volume_id($drive->{file});
> +		    my $storage_list = [ $src_sid ];
> +		    push @$storage_list, $storage if defined($storage);
> +		    my $clonelimit = PVE::Storage::get_bandwidth_limit('clone', $storage_list, $bwlimit);
> +
> +		    my $newdrive = PVE::QemuServer::clone_disk(
> +			$storecfg,
> +			$vmid,
> +			$running,
> +			$opt,
> +			$drive,
> +			$snapname,
> +			$newid,
> +			$storage,
> +			$format,
> +			$fullclone->{$opt},
> +			$newvollist,
> +			$jobs,
> +			$completion,
> +			$oldconf->{agent},
> +			$clonelimit,
> +			$oldconf
> +		    );
> +
> +		    $newconf->{$opt} = PVE::QemuServer::print_drive($newdrive);
> +
> +		    PVE::QemuConfig->write_config($newid, $newconf);
> +		    $i++;
> +		}
> +
> +		delete $newconf->{lock};
> +
> +		# do not write pending changes
> +		if (my @changes = keys %{$newconf->{pending}}) {
> +		    my $pending = join(',', @changes);
> +		    warn "found pending changes for '$pending', discarding for clone\n";
> +		    delete $newconf->{pending};
> +		}
> +
> +		PVE::QemuConfig->write_config($newid, $newconf);
> +
> +		if ($target) {
> +		    # always deactivate volumes - avoid lvm LVs to be active on several nodes
> +		    PVE::Storage::deactivate_volumes($storecfg, $vollist, $snapname) if !$running;
> +		    PVE::Storage::deactivate_volumes($storecfg, $newvollist);
> +
> +		    my $newconffile = PVE::QemuConfig->config_file($newid, $target);
> +		    die "Failed to move config to node '$target' - rename failed: $!\n"
> +			if !rename($conffile, $newconffile);
> +		}
> +
> +		PVE::AccessControl::add_vm_to_pool($newid, $pool) if $pool;
> +	    };
> +	    if (my $err = $@) {
> +		eval { PVE::QemuServer::qemu_blockjobs_cancel($vmid, $jobs) };
> +		sleep 1; # some storage like rbd need to wait before release volume - really?
> +
> +		foreach my $volid (@$newvollist) {
> +		    eval { PVE::Storage::vdisk_free($storecfg, $volid); };
> +		    warn $@ if $@;
> +		}
> +
> +		PVE::Firewall::remove_vmfw_conf($newid);
> +
> +		unlink $conffile; # avoid races -> last thing before die
> +
> +		die "clone failed: $err";
> +	    }
> +
> +	    return;
>  	};
>  
>  	# Aquire exclusive lock lock for $newid
> @@ -3287,12 +3285,18 @@ __PACKAGE__->register_method({
>  	    return PVE::QemuConfig->lock_config_full($newid, 1, $clonefn);
>  	};
>  
> -	# exclusive lock if VM is running - else shared lock is enough;
> -	if ($running) {
> -	    return PVE::QemuConfig->lock_config_full($vmid, 1, $lock_target_vm);
> -	} else {
> -	    return PVE::QemuConfig->lock_config_shared($vmid, 1, $lock_target_vm);
> -	}
> +	my $lock_source_vm = sub {
> +	    # exclusive lock if VM is running - else shared lock is enough;
> +	    if ($running) {
> +		return PVE::QemuConfig->lock_config_full($vmid, 1, $lock_target_vm);
> +	    } else {
> +		return PVE::QemuConfig->lock_config_shared($vmid, 1, $lock_target_vm);
> +	    }
> +	};
> +
> +	$load_and_check->(); # early checks before forking/locking
> +
> +	return $rpcenv->fork_worker('qmclone', $vmid, $authuser, $lock_source_vm);
>      }});
>  
>  __PACKAGE__->register_method({
> -- 
> 2.30.2
> 
> 
> 
> _______________________________________________
> pve-devel mailing list
> pve-devel@lists.proxmox.com
> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
> 
> 
>

* Re: [pve-devel] [PATCH qemu-server 2/4] api: clone: fork before locking
  2022-01-31 12:34   ` Fabian Grünbichler
@ 2022-02-03  9:31     ` Fabian Ebner
  2022-02-03 12:49       ` Fabian Grünbichler
  0 siblings, 1 reply; 9+ messages in thread
From: Fabian Ebner @ 2022-02-03  9:31 UTC (permalink / raw)
  To: pve-devel, Fabian Grünbichler

Am 31.01.22 um 13:34 schrieb Fabian Grünbichler:
> On January 27, 2022 3:01 pm, Fabian Ebner wrote:
>> using the familiar early+repeated checks pattern from other API calls.
>> The only intended functional changes are with regard to locking/forking.
> 
> two questions:
> - the FW config cloning happens inside the worker now, while it was 
>   previously before forking the worker (LGTM, but might be called out 
>   explicitly if intentional ;))

Honestly, I didn't think too much about it, so thanks for pointing that
out! But thinking about it now, I also don't see an obvious issue with
it and IMHO it feels more natural to be part of the worker since it
takes the firewall config lock and the cleanup also happens inside the
worker.

> - there are some checks at the start of the endpoint (checking 
>   storage/target), which are not repeated after the fork+lock - while 
>   unlikely, our view of storage.cfg could change in-between (lock guest 
>   config -> cfs_update). should those be moved in the check sub (or into 
>   the check_storage_access_clone helper)?
> 

Yes, for better consistency that should be done. Either way is fine with
me. Should I send a v2 or are you going to do a follow-up?

> rest of the series LGTM
> 
>>
>> For a full clone of a running VM without a guest agent, this also fixes
>> the vm_{resume,suspend} calls issued for drive mirror completion.
>> Previously, those just timed out because they could not get the lock:
>>
>>> create full clone of drive scsi0 (rbdkvm:vm-104-disk-0)
>>> Formatting '/var/lib/vz/images/105/vm-105-disk-0.raw', fmt=raw
>>> size=4294967296 preallocation=off
>>> drive mirror is starting for drive-scsi0
>>> drive-scsi0: transferred 2.0 MiB of 4.0 GiB (0.05%) in 0s
>>> drive-scsi0: transferred 635.0 MiB of 4.0 GiB (15.50%) in 1s
>>> drive-scsi0: transferred 1.6 GiB of 4.0 GiB (40.50%) in 2s
>>> drive-scsi0: transferred 3.6 GiB of 4.0 GiB (90.23%) in 3s
>>> drive-scsi0: transferred 4.0 GiB of 4.0 GiB (100.00%) in 4s, ready
>>> all 'mirror' jobs are ready
>>> suspend vm
>>> trying to acquire lock...
>>> can't lock file '/var/lock/qemu-server/lock-104.conf' - got timeout
>>> drive-scsi0: Cancelling block job
>>> drive-scsi0: Done.
>>> resume vm
>>> trying to acquire lock...
>>> can't lock file '/var/lock/qemu-server/lock-104.conf' - got timeout
>>
>> Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
>> ---
>>
>> Best viewed with -w.
>>
>>  PVE/API2/Qemu.pm | 220 ++++++++++++++++++++++++-----------------------
>>  1 file changed, 112 insertions(+), 108 deletions(-)
>>
>> diff --git a/PVE/API2/Qemu.pm b/PVE/API2/Qemu.pm
>> index 6992f6f..38e08c8 100644
>> --- a/PVE/API2/Qemu.pm
>> +++ b/PVE/API2/Qemu.pm
>> @@ -3079,9 +3079,7 @@ __PACKAGE__->register_method({
>>  
>>  	my $running = PVE::QemuServer::check_running($vmid) || 0;
>>  
>> -	my $clonefn = sub {
>> -	    # do all tests after lock but before forking worker - if possible
>> -
>> +	my $load_and_check = sub {
>>  	    my $conf = PVE::QemuConfig->load_config($vmid);
>>  	    PVE::QemuConfig->check_lock($conf);
>>  
>> @@ -3091,7 +3089,7 @@ __PACKAGE__->register_method({
>>  	    die "snapshot '$snapname' does not exist\n"
>>  		if $snapname && !defined( $conf->{snapshots}->{$snapname});
>>  
>> -	    my $full = extract_param($param, 'full') // !PVE::QemuConfig->is_template($conf);
>> +	    my $full = $param->{full} // !PVE::QemuConfig->is_template($conf);
>>  
>>  	    die "parameter 'storage' not allowed for linked clones\n"
>>  		if defined($storage) && !$full;
>> @@ -3156,7 +3154,13 @@ __PACKAGE__->register_method({
>>  		}
>>  	    }
>>  
>> -            # auto generate a new uuid
>> +	    return ($conffile, $newconf, $oldconf, $vollist, $drives, $fullclone);
>> +	};
>> +
>> +	my $clonefn = sub {
>> +	    my ($conffile, $newconf, $oldconf, $vollist, $drives, $fullclone) = $load_and_check->();
>> +
>> +	    # auto generate a new uuid
>>  	    my $smbios1 = PVE::QemuServer::parse_smbios1($newconf->{smbios1} || '');
>>  	    $smbios1->{uuid} = PVE::QemuServer::generate_uuid();
>>  	    $newconf->{smbios1} = PVE::QemuServer::print_smbios1($smbios1);
>> @@ -3181,105 +3185,99 @@ __PACKAGE__->register_method({
>>  	    # FIXME use PVE::QemuConfig->create_and_lock_config and adapt code
>>  	    PVE::Tools::file_set_contents($conffile, "# qmclone temporary file\nlock: clone\n");
>>  
>> -	    my $realcmd = sub {
>> -		my $upid = shift;
>> -
>> -		my $newvollist = [];
>> -		my $jobs = {};
>> -
>> -		eval {
>> -		    local $SIG{INT} =
>> -			local $SIG{TERM} =
>> -			local $SIG{QUIT} =
>> -			local $SIG{HUP} = sub { die "interrupted by signal\n"; };
>> -
>> -		    PVE::Storage::activate_volumes($storecfg, $vollist, $snapname);
>> -
>> -		    my $bwlimit = extract_param($param, 'bwlimit');
>> -
>> -		    my $total_jobs = scalar(keys %{$drives});
>> -		    my $i = 1;
>> -
>> -		    foreach my $opt (sort keys %$drives) {
>> -			my $drive = $drives->{$opt};
>> -			my $skipcomplete = ($total_jobs != $i); # finish after last drive
>> -			my $completion = $skipcomplete ? 'skip' : 'complete';
>> -
>> -			my $src_sid = PVE::Storage::parse_volume_id($drive->{file});
>> -			my $storage_list = [ $src_sid ];
>> -			push @$storage_list, $storage if defined($storage);
>> -			my $clonelimit = PVE::Storage::get_bandwidth_limit('clone', $storage_list, $bwlimit);
>> -
>> -			my $newdrive = PVE::QemuServer::clone_disk(
>> -			    $storecfg,
>> -			    $vmid,
>> -			    $running,
>> -			    $opt,
>> -			    $drive,
>> -			    $snapname,
>> -			    $newid,
>> -			    $storage,
>> -			    $format,
>> -			    $fullclone->{$opt},
>> -			    $newvollist,
>> -			    $jobs,
>> -			    $completion,
>> -			    $oldconf->{agent},
>> -			    $clonelimit,
>> -			    $oldconf
>> -			);
>> -
>> -			$newconf->{$opt} = PVE::QemuServer::print_drive($newdrive);
>> -
>> -			PVE::QemuConfig->write_config($newid, $newconf);
>> -			$i++;
>> -		    }
>> -
>> -		    delete $newconf->{lock};
>> -
>> -		    # do not write pending changes
>> -		    if (my @changes = keys %{$newconf->{pending}}) {
>> -			my $pending = join(',', @changes);
>> -			warn "found pending changes for '$pending', discarding for clone\n";
>> -			delete $newconf->{pending};
>> -		    }
>> -
>> -		    PVE::QemuConfig->write_config($newid, $newconf);
>> -
>> -                    if ($target) {
>> -			# always deactivate volumes - avoid lvm LVs to be active on several nodes
>> -			PVE::Storage::deactivate_volumes($storecfg, $vollist, $snapname) if !$running;
>> -			PVE::Storage::deactivate_volumes($storecfg, $newvollist);
>> -
>> -			my $newconffile = PVE::QemuConfig->config_file($newid, $target);
>> -			die "Failed to move config to node '$target' - rename failed: $!\n"
>> -			    if !rename($conffile, $newconffile);
>> -		    }
>> -
>> -		    PVE::AccessControl::add_vm_to_pool($newid, $pool) if $pool;
>> -		};
>> -		if (my $err = $@) {
>> -		    eval { PVE::QemuServer::qemu_blockjobs_cancel($vmid, $jobs) };
>> -		    sleep 1; # some storage like rbd need to wait before release volume - really?
>> -
>> -		    foreach my $volid (@$newvollist) {
>> -			eval { PVE::Storage::vdisk_free($storecfg, $volid); };
>> -			warn $@ if $@;
>> -		    }
>> -
>> -		    PVE::Firewall::remove_vmfw_conf($newid);
>> -
>> -		    unlink $conffile; # avoid races -> last thing before die
>> -
>> -		    die "clone failed: $err";
>> -		}
>> -
>> -		return;
>> -	    };
>> -
>>  	    PVE::Firewall::clone_vmfw_conf($vmid, $newid);
>>  
>> -	    return $rpcenv->fork_worker('qmclone', $vmid, $authuser, $realcmd);
>> +	    my $newvollist = [];
>> +	    my $jobs = {};
>> +
>> +	    eval {
>> +		local $SIG{INT} =
>> +		    local $SIG{TERM} =
>> +		    local $SIG{QUIT} =
>> +		    local $SIG{HUP} = sub { die "interrupted by signal\n"; };
>> +
>> +		PVE::Storage::activate_volumes($storecfg, $vollist, $snapname);
>> +
>> +		my $bwlimit = extract_param($param, 'bwlimit');
>> +
>> +		my $total_jobs = scalar(keys %{$drives});
>> +		my $i = 1;
>> +
>> +		foreach my $opt (sort keys %$drives) {
>> +		    my $drive = $drives->{$opt};
>> +		    my $skipcomplete = ($total_jobs != $i); # finish after last drive
>> +		    my $completion = $skipcomplete ? 'skip' : 'complete';
>> +
>> +		    my $src_sid = PVE::Storage::parse_volume_id($drive->{file});
>> +		    my $storage_list = [ $src_sid ];
>> +		    push @$storage_list, $storage if defined($storage);
>> +		    my $clonelimit = PVE::Storage::get_bandwidth_limit('clone', $storage_list, $bwlimit);
>> +
>> +		    my $newdrive = PVE::QemuServer::clone_disk(
>> +			$storecfg,
>> +			$vmid,
>> +			$running,
>> +			$opt,
>> +			$drive,
>> +			$snapname,
>> +			$newid,
>> +			$storage,
>> +			$format,
>> +			$fullclone->{$opt},
>> +			$newvollist,
>> +			$jobs,
>> +			$completion,
>> +			$oldconf->{agent},
>> +			$clonelimit,
>> +			$oldconf
>> +		    );
>> +
>> +		    $newconf->{$opt} = PVE::QemuServer::print_drive($newdrive);
>> +
>> +		    PVE::QemuConfig->write_config($newid, $newconf);
>> +		    $i++;
>> +		}
>> +
>> +		delete $newconf->{lock};
>> +
>> +		# do not write pending changes
>> +		if (my @changes = keys %{$newconf->{pending}}) {
>> +		    my $pending = join(',', @changes);
>> +		    warn "found pending changes for '$pending', discarding for clone\n";
>> +		    delete $newconf->{pending};
>> +		}
>> +
>> +		PVE::QemuConfig->write_config($newid, $newconf);
>> +
>> +		if ($target) {
>> +		    # always deactivate volumes - avoid lvm LVs to be active on several nodes
>> +		    PVE::Storage::deactivate_volumes($storecfg, $vollist, $snapname) if !$running;
>> +		    PVE::Storage::deactivate_volumes($storecfg, $newvollist);
>> +
>> +		    my $newconffile = PVE::QemuConfig->config_file($newid, $target);
>> +		    die "Failed to move config to node '$target' - rename failed: $!\n"
>> +			if !rename($conffile, $newconffile);
>> +		}
>> +
>> +		PVE::AccessControl::add_vm_to_pool($newid, $pool) if $pool;
>> +	    };
>> +	    if (my $err = $@) {
>> +		eval { PVE::QemuServer::qemu_blockjobs_cancel($vmid, $jobs) };
>> +		sleep 1; # some storage like rbd need to wait before release volume - really?
>> +
>> +		foreach my $volid (@$newvollist) {
>> +		    eval { PVE::Storage::vdisk_free($storecfg, $volid); };
>> +		    warn $@ if $@;
>> +		}
>> +
>> +		PVE::Firewall::remove_vmfw_conf($newid);
>> +
>> +		unlink $conffile; # avoid races -> last thing before die
>> +
>> +		die "clone failed: $err";
>> +	    }
>> +
>> +	    return;
>>  	};
>>  
>>  	# Aquire exclusive lock lock for $newid
>> @@ -3287,12 +3285,18 @@ __PACKAGE__->register_method({
>>  	    return PVE::QemuConfig->lock_config_full($newid, 1, $clonefn);
>>  	};
>>  
>> -	# exclusive lock if VM is running - else shared lock is enough;
>> -	if ($running) {
>> -	    return PVE::QemuConfig->lock_config_full($vmid, 1, $lock_target_vm);
>> -	} else {
>> -	    return PVE::QemuConfig->lock_config_shared($vmid, 1, $lock_target_vm);
>> -	}
>> +	my $lock_source_vm = sub {
>> +	    # exclusive lock if VM is running - else shared lock is enough;
>> +	    if ($running) {
>> +		return PVE::QemuConfig->lock_config_full($vmid, 1, $lock_target_vm);
>> +	    } else {
>> +		return PVE::QemuConfig->lock_config_shared($vmid, 1, $lock_target_vm);
>> +	    }
>> +	};
>> +
>> +	$load_and_check->(); # early checks before forking/locking
>> +
>> +	return $rpcenv->fork_worker('qmclone', $vmid, $authuser, $lock_source_vm);
>>      }});
>>  
>>  __PACKAGE__->register_method({
>> -- 
>> 2.30.2
>>
>>
>>
>> _______________________________________________
>> pve-devel mailing list
>> pve-devel@lists.proxmox.com
>> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
>>
>>
>>
> 
> 
> _______________________________________________
> pve-devel mailing list
> pve-devel@lists.proxmox.com
> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
> 
>

* Re: [pve-devel] [PATCH qemu-server 2/4] api: clone: fork before locking
  2022-02-03  9:31     ` Fabian Ebner
@ 2022-02-03 12:49       ` Fabian Grünbichler
  0 siblings, 0 replies; 9+ messages in thread
From: Fabian Grünbichler @ 2022-02-03 12:49 UTC (permalink / raw)
  To: Fabian Ebner, pve-devel

On February 3, 2022 10:31 am, Fabian Ebner wrote:
> Am 31.01.22 um 13:34 schrieb Fabian Grünbichler:
>> On January 27, 2022 3:01 pm, Fabian Ebner wrote:
>>> using the familiar early+repeated checks pattern from other API calls.
>>> The only intended functional changes are with regard to locking/forking.
>> 
>> two questions:
>> - the FW config cloning happens inside the worker now, while it was 
>>   previously before forking the worker (LGTM, but might be called out 
>>   explicitly if intentional ;))
> 
> Honestly, I didn't think too much about it, so thanks for pointing that
> out! But thinking about it now, I also don't see an obvious issue with
> it and IMHO it feels more natural to be part of the worker since it
> takes the firewall config lock and the cleanup also happens inside the
> worker.
> 
>> - there are some checks at the start of the endpoint (checking 
>>   storage/target), which are not repeated after the fork+lock - while 
>>   unlikely, our view of storage.cfg could change in-between (lock guest 
>>   config -> cfs_update). should those be moved in the check sub (or into 
>>   the check_storage_access_clone helper)?
>> 
> 
> Yes, for better consistency that should be done. Either way is fine with
> me. Should I send a v2 or are you going to do a follow-up?

I'll do a follow-up!

> 
>> rest of the series LGTM

* [pve-devel] applied-series: [PATCH-SERIES qemu-server] clone_disk-related improvements
  2022-01-27 14:01 [pve-devel] [PATCH-SERIES qemu-server] clone_disk-related improvements Fabian Ebner
                   ` (3 preceding siblings ...)
  2022-01-27 14:01 ` [pve-devel] [PATCH qemu-server 4/4] clone disk: don't modify drive parameter Fabian Ebner
@ 2022-02-08  8:36 ` Fabian Grünbichler
  4 siblings, 0 replies; 9+ messages in thread
From: Fabian Grünbichler @ 2022-02-08  8:36 UTC (permalink / raw)
  To: Proxmox VE development discussion

with the following addition on top (shown here with `-w`), to repeat even
more of the checks after forking/locking:

diff --git a/PVE/API2/Qemu.pm b/PVE/API2/Qemu.pm
index 59e083eb..a359d096 100644
--- a/PVE/API2/Qemu.pm
+++ b/PVE/API2/Qemu.pm
@@ -3046,7 +3046,6 @@ __PACKAGE__->register_method({
 	my $vmid = extract_param($param, 'vmid');
 	my $newid = extract_param($param, 'newid');
 	my $pool = extract_param($param, 'pool');
-	$rpcenv->check_pool_exist($pool) if defined($pool);
 
         my $snapname = extract_param($param, 'snapname');
 	my $storage = extract_param($param, 'storage');
@@ -3059,6 +3058,10 @@ __PACKAGE__->register_method({
 	    undef $target;
 	}
 
+	my $running = PVE::QemuServer::check_running($vmid) || 0;
+
+	my $load_and_check = sub {
+	    $rpcenv->check_pool_exist($pool) if defined($pool);
 	    PVE::Cluster::check_node_exists($target) if $target;
 
 	    my $storecfg = PVE::Storage::config();
@@ -3071,15 +3074,13 @@ __PACKAGE__->register_method({
 		    PVE::Storage::storage_check_enabled($storecfg, $storage, $target);
 		    # clone only works if target storage is shared
 		    my $scfg = PVE::Storage::storage_config($storecfg, $storage);
-		die "can't clone to non-shared storage '$storage'\n" if !$scfg->{shared};
+		    die "can't clone to non-shared storage '$storage'\n"
+			if !$scfg->{shared};
 		}
 	    }
 
 	    PVE::Cluster::check_cfs_quorum();
 
-	my $running = PVE::QemuServer::check_running($vmid) || 0;
-
-	my $load_and_check = sub {
 	    my $conf = PVE::QemuConfig->load_config($vmid);
 	    PVE::QemuConfig->check_lock($conf);
 
@@ -3159,6 +3160,7 @@ __PACKAGE__->register_method({
 
 	my $clonefn = sub {
 	    my ($conffile, $newconf, $oldconf, $vollist, $drives, $fullclone) = $load_and_check->();
+	    my $storecfg = PVE::Storage::config();
 
 	    # auto generate a new uuid
 	    my $smbios1 = PVE::QemuServer::parse_smbios1($newconf->{smbios1} || '');

On January 27, 2022 3:01 pm, Fabian Ebner wrote:
> The second patch fixes a concrete bug where vm_{resume,suspend} calls were
> not issued for a full clone of a running VM without a guest agent. The other
> patches are minor improvements, mostly aimed at preventing future bugs.
> 
> Fabian Ebner (4):
>   drive mirror monitor: warn when suspend/resume/freeze/thaw calls fail
>   api: clone: fork before locking
>   api: move disk: fork before locking
>   clone disk: don't modify drive parameter
> 
>  PVE/API2/Qemu.pm  | 382 ++++++++++++++++++++++++----------------------
>  PVE/QemuServer.pm |  10 +-
>  2 files changed, 205 insertions(+), 187 deletions(-)
> 
> -- 
> 2.30.2
> 
> 
> 
> _______________________________________________
> pve-devel mailing list
> pve-devel@lists.proxmox.com
> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
> 
> 
>

end of thread

Thread overview: 9+ messages
2022-01-27 14:01 [pve-devel] [PATCH-SERIES qemu-server] clone_disk-related improvements Fabian Ebner
2022-01-27 14:01 ` [pve-devel] [PATCH qemu-server 1/4] drive mirror monitor: warn when suspend/resume/freeze/thaw calls fail Fabian Ebner
2022-01-27 14:01 ` [pve-devel] [PATCH qemu-server 2/4] api: clone: fork before locking Fabian Ebner
2022-01-31 12:34   ` Fabian Grünbichler
2022-02-03  9:31     ` Fabian Ebner
2022-02-03 12:49       ` Fabian Grünbichler
2022-01-27 14:01 ` [pve-devel] [PATCH qemu-server 3/4] api: move disk: " Fabian Ebner
2022-01-27 14:01 ` [pve-devel] [PATCH qemu-server 4/4] clone disk: don't modify drive parameter Fabian Ebner
2022-02-08  8:36 ` [pve-devel] applied-series: [PATCH-SERIES qemu-server] clone_disk-related improvements Fabian Grünbichler
