public inbox for pve-devel@lists.proxmox.com
* [pve-devel] [PATCH qemu-server/manager] pci live migration followups
@ 2024-03-20 12:51 Dominik Csapak
  2024-03-20 12:51 ` [pve-devel] [PATCH qemu-server 1/3] stop cleanup: remove unnecessary tpmstate cleanup Dominik Csapak
                   ` (5 more replies)
  0 siblings, 6 replies; 13+ messages in thread
From: Dominik Csapak @ 2024-03-20 12:51 UTC (permalink / raw)
  To: pve-devel

this series is a follow-up to my previous
'enable experimental support for pci live migration'
series [0] and is intended to be applied on top (see the end for exceptions)

this fixes some issues that popped up during testing of the live migration.
if there is a review of the v1 of the other series and of this one, i'll send
a v2 that includes both if necessary.

opted to send this separately, because it mostly touches independent things
and does not need to change the behaviour of the other series.

the first two patches of qemu-server could be applied regardless of the
pci live migration series, since they fix/improve parts of the migration
cleanup in general.

same with the first patch of pve-manager, as it improves the checks
for bulk migration independently of the pci live migration.

0: https://lists.proxmox.com/pipermail/pve-devel/2024-March/062226.html

qemu-server:

Dominik Csapak (3):
  stop cleanup: remove unnecessary tpmstate cleanup
  migrate: call vm_stop_cleanup after stopping in phase3_cleanup
  api: include not mapped resources for running vms in migrate
    preconditions

 PVE/API2/Qemu.pm   | 27 +++++++++++++++------------
 PVE/QemuMigrate.pm | 12 ++++++------
 PVE/QemuServer.pm  | 20 ++++++++++----------
 3 files changed, 31 insertions(+), 28 deletions(-)

pve-manager:

Dominik Csapak (3):
  bulk migrate: improve precondition checks
  bulk migrate: include checks for live-migratable local resources
  ui: adapt migration window to precondition api change

 PVE/API2/Nodes.pm              | 27 ++++++++++++++++++++++++---
 www/manager6/window/Migrate.js | 24 ++++++++++++------------
 2 files changed, 36 insertions(+), 15 deletions(-)

-- 
2.39.2






* [pve-devel] [PATCH qemu-server 1/3] stop cleanup: remove unnecessary tpmstate cleanup
  2024-03-20 12:51 [pve-devel] [PATCH qemu-server/manager] pci live migration followups Dominik Csapak
@ 2024-03-20 12:51 ` Dominik Csapak
  2024-03-22 14:54   ` Fiona Ebner
  2024-03-20 12:51 ` [pve-devel] [PATCH qemu-server 2/3] migrate: call vm_stop_cleanup after stopping in phase3_cleanup Dominik Csapak
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 13+ messages in thread
From: Dominik Csapak @ 2024-03-20 12:51 UTC (permalink / raw)
  To: pve-devel

tpmstate0 is already included in `get_vm_volumes`, and our only storage
plugin that has unmap_volume implemented is the RBDPlugin, where we call
unmap in `deactivate_volume`. So it's already unmapped by the
`deactivate_volumes` calls above.
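
as a rough sketch of the chain this relies on (simplified; the RBDPlugin
internals are paraphrased here, not quoted):

    my $vollist = get_vm_volumes($conf);                    # already contains the tpmstate0 volid
    PVE::Storage::deactivate_volumes($storecfg, $vollist);  # ends up in RBDPlugin::deactivate_volume(),
                                                            # which already unmaps the RBD device, so the
                                                            # explicit unmap_volume() removed below never
                                                            # did anything additional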

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
---
 PVE/QemuServer.pm | 8 --------
 1 file changed, 8 deletions(-)

diff --git a/PVE/QemuServer.pm b/PVE/QemuServer.pm
index ef3aee20..d53e9693 100644
--- a/PVE/QemuServer.pm
+++ b/PVE/QemuServer.pm
@@ -6167,14 +6167,6 @@ sub vm_stop_cleanup {
 	if (!$keepActive) {
 	    my $vollist = get_vm_volumes($conf);
 	    PVE::Storage::deactivate_volumes($storecfg, $vollist);
-
-	    if (my $tpmdrive = $conf->{tpmstate0}) {
-		my $tpm = parse_drive("tpmstate0", $tpmdrive);
-		my ($storeid, $volname) = PVE::Storage::parse_volume_id($tpm->{file}, 1);
-		if ($storeid) {
-		    PVE::Storage::unmap_volume($storecfg, $tpm->{file});
-		}
-	    }
 	}
 
 	foreach my $ext (qw(mon qmp pid vnc qga)) {
-- 
2.39.2






* [pve-devel] [PATCH qemu-server 2/3] migrate: call vm_stop_cleanup after stopping in phase3_cleanup
  2024-03-20 12:51 [pve-devel] [PATCH qemu-server/manager] pci live migration followups Dominik Csapak
  2024-03-20 12:51 ` [pve-devel] [PATCH qemu-server 1/3] stop cleanup: remove unnecessary tpmstate cleanup Dominik Csapak
@ 2024-03-20 12:51 ` Dominik Csapak
  2024-03-22 15:17   ` Fiona Ebner
  2024-03-20 12:51 ` [pve-devel] [PATCH qemu-server 3/3] api: include not mapped resources for running vms in migrate preconditions Dominik Csapak
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 13+ messages in thread
From: Dominik Csapak @ 2024-03-20 12:51 UTC (permalink / raw)
  To: pve-devel

we currently only call deactivate_volumes, but we actually want to call
the whole vm_stop_cleanup, since that is not invoked by the vm_stop
above (we cannot parse the config anymore) and might do other cleanups
we also want to do (like mdev cleanup).

For this to work properly we have to clone the original config at the
beginning, since we might modify the volids.

To get better log output, add a `noerr` parameter to `vm_stop_cleanup`
that is enabled by default and decides if we `die` or `warn` on an
error. That way we can catch the error in the migrate code and
log it properly.

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
---
 PVE/QemuMigrate.pm | 12 ++++++------
 PVE/QemuServer.pm  | 12 ++++++++++--
 2 files changed, 16 insertions(+), 8 deletions(-)

diff --git a/PVE/QemuMigrate.pm b/PVE/QemuMigrate.pm
index b3570770..1be12bf1 100644
--- a/PVE/QemuMigrate.pm
+++ b/PVE/QemuMigrate.pm
@@ -5,6 +5,7 @@ use warnings;
 
 use IO::File;
 use IPC::Open2;
+use Storable qw(dclone);
 use Time::HiRes qw( usleep );
 
 use PVE::Cluster;
@@ -1460,7 +1461,8 @@ sub phase3_cleanup {
 
     my $tunnel = $self->{tunnel};
 
-    my $sourcevollist = PVE::QemuServer::get_vm_volumes($conf);
+    # we'll need an unmodified copy of the config later for the cleanup
+    my $oldconf = dclone($conf);
 
     if ($self->{volume_map} && !$self->{opts}->{remote}) {
 	my $target_drives = $self->{target_drive};
@@ -1591,12 +1593,10 @@ sub phase3_cleanup {
 	$self->{errors} = 1;
     }
 
-    # always deactivate volumes - avoid lvm LVs to be active on several nodes
-    eval {
-	PVE::Storage::deactivate_volumes($self->{storecfg}, $sourcevollist);
-    };
+    # stop with nocheck does not do a cleanup, so do it here with the original config
+    eval { PVE::QemuServer::vm_stop_cleanup($self->{storecfg}, $vmid, $oldconf, 0) };
     if (my $err = $@) {
-	$self->log('err', $err);
+	$self->log('err', "cleanup for vm failed - $err");
 	$self->{errors} = 1;
     }
 
diff --git a/PVE/QemuServer.pm b/PVE/QemuServer.pm
index d53e9693..54f73093 100644
--- a/PVE/QemuServer.pm
+++ b/PVE/QemuServer.pm
@@ -6160,7 +6160,9 @@ sub cleanup_pci_devices {
 }
 
 sub vm_stop_cleanup {
-    my ($storecfg, $vmid, $conf, $keepActive, $apply_pending_changes) = @_;
+    my ($storecfg, $vmid, $conf, $keepActive, $apply_pending_changes, $noerr) = @_;
+
+    $noerr //= 1; # defaults to warning
 
     eval {
 
@@ -6186,7 +6188,13 @@ sub vm_stop_cleanup {
 
 	vmconfig_apply_pending($vmid, $conf, $storecfg) if $apply_pending_changes;
     };
-    warn $@ if $@; # avoid errors - just warn
+    if (my $err = $@) {
+	if ($noerr) {
+	    warn $err;
+	} else {
+	    die $err;
+	}
+    }
 }
 
 # call only in locked context
-- 
2.39.2






* [pve-devel] [PATCH qemu-server 3/3] api: include not mapped resources for running vms in migrate preconditions
  2024-03-20 12:51 [pve-devel] [PATCH qemu-server/manager] pci live migration followups Dominik Csapak
  2024-03-20 12:51 ` [pve-devel] [PATCH qemu-server 1/3] stop cleanup: remove unnecessary tpmstate cleanup Dominik Csapak
  2024-03-20 12:51 ` [pve-devel] [PATCH qemu-server 2/3] migrate: call vm_stop_cleanup after stopping in phase3_cleanup Dominik Csapak
@ 2024-03-20 12:51 ` Dominik Csapak
  2024-03-22 14:53   ` Stefan Sterz
  2024-03-22 16:19   ` Fiona Ebner
  2024-03-20 12:51 ` [pve-devel] [PATCH manager 1/3] bulk migrate: improve precondition checks Dominik Csapak
                   ` (2 subsequent siblings)
  5 siblings, 2 replies; 13+ messages in thread
From: Dominik Csapak @ 2024-03-20 12:51 UTC (permalink / raw)
  To: pve-devel

so that we can show a proper warning in the migrate dialog and check it
in the bulk migrate precondition check

the unavailable_storages and allowed_nodes should be the same as before

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
---
not super happy with this partial approach, we probably should just
always return the 'allowed_nodes' and 'not_allowed_nodes' and change
the gui to handle the running vs not running state?

 PVE/API2/Qemu.pm | 27 +++++++++++++++------------
 1 file changed, 15 insertions(+), 12 deletions(-)

diff --git a/PVE/API2/Qemu.pm b/PVE/API2/Qemu.pm
index 8581a529..b0f155f7 100644
--- a/PVE/API2/Qemu.pm
+++ b/PVE/API2/Qemu.pm
@@ -4439,7 +4439,7 @@ __PACKAGE__->register_method({
 	    not_allowed_nodes => {
 		type => 'object',
 		optional => 1,
-		description => "List not allowed nodes with additional informations, only passed if VM is offline"
+		description => "List not allowed nodes with additional informations",
 	    },
 	    local_disks => {
 		type => 'array',
@@ -4496,25 +4496,28 @@ __PACKAGE__->register_method({
 
 	# if vm is not running, return target nodes where local storage/mapped devices are available
 	# for offline migration
+	my $checked_nodes = {};
+	my $allowed_nodes = [];
 	if (!$res->{running}) {
-	    $res->{allowed_nodes} = [];
-	    my $checked_nodes = PVE::QemuServer::check_local_storage_availability($vmconf, $storecfg);
+	    $checked_nodes = PVE::QemuServer::check_local_storage_availability($vmconf, $storecfg);
 	    delete $checked_nodes->{$localnode};
+	}
 
-	    foreach my $node (keys %$checked_nodes) {
-		my $missing_mappings = $missing_mappings_by_node->{$node};
-		if (scalar($missing_mappings->@*)) {
-		    $checked_nodes->{$node}->{'unavailable-resources'} = $missing_mappings;
-		    next;
-		}
+	foreach my $node ((keys $checked_nodes->%*, keys $missing_mappings_by_node->%*)) {
+	    my $missing_mappings = $missing_mappings_by_node->{$node};
+	    if (scalar($missing_mappings->@*)) {
+		$checked_nodes->{$node}->{'unavailable-resources'} = $missing_mappings;
+		next;
+	    }
 
+	    if (!$res->{running}) {
 		if (!defined($checked_nodes->{$node}->{unavailable_storages})) {
-		    push @{$res->{allowed_nodes}}, $node;
+		    push $allowed_nodes->@*, $node;
 		}
-
 	    }
-	    $res->{not_allowed_nodes} = $checked_nodes;
 	}
+	$res->{not_allowed_nodes} = $checked_nodes if scalar(keys($checked_nodes->%*)) || !$res->{running};
+	$res->{allowed_nodes} = $allowed_nodes if scalar($allowed_nodes->@*) || !$res->{running};
 
 	my $local_disks = &$check_vm_disks_local($storecfg, $vmconf, $vmid);
 	$res->{local_disks} = [ values %$local_disks ];;
-- 
2.39.2






* [pve-devel] [PATCH manager 1/3] bulk migrate: improve precondition checks
  2024-03-20 12:51 [pve-devel] [PATCH qemu-server/manager] pci live migration followups Dominik Csapak
                   ` (2 preceding siblings ...)
  2024-03-20 12:51 ` [pve-devel] [PATCH qemu-server 3/3] api: include not mapped resources for running vms in migrate preconditions Dominik Csapak
@ 2024-03-20 12:51 ` Dominik Csapak
  2024-03-20 12:51 ` [pve-devel] [PATCH manager 2/3] bulk migrate: include checks for live-migratable local resources Dominik Csapak
  2024-03-20 12:51 ` [pve-devel] [PATCH manager 3/3] ui: adapt migration window to precondition api change Dominik Csapak
  5 siblings, 0 replies; 13+ messages in thread
From: Dominik Csapak @ 2024-03-20 12:51 UTC (permalink / raw)
  To: pve-devel

this now takes into account the 'not_allowed_nodes' hash we get from the
api call. With that, we can now limit the 'local_resources' check to
online vms only, as for offline guests, the 'unavailable-resources' hash
already includes mapped devices that don't exist on the target node.

This now also includes unavailable storages on target nodes.

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
---
 PVE/API2/Nodes.pm | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/PVE/API2/Nodes.pm b/PVE/API2/Nodes.pm
index cc5ee65e..1d5c68f5 100644
--- a/PVE/API2/Nodes.pm
+++ b/PVE/API2/Nodes.pm
@@ -2241,11 +2241,23 @@ my $create_migrate_worker = sub {
 	    $invalidConditions .= join(', ', map { $_->{volid} } @{$preconditions->{local_disks}});
 	}
 
-	if (@{$preconditions->{local_resources}}) {
+	if ($online && scalar($preconditions->{local_resources}->@*)) {
 	    $invalidConditions .= "\n  Has local resources: ";
 	    $invalidConditions .= join(', ', @{$preconditions->{local_resources}});
 	}
 
+	if (my $not_allowed_nodes = $preconditions->{not_allowed_nodes}) {
+	    if (my $unavail_storages = $not_allowed_nodes->{$target}->{unavailable_storages}) {
+		$invalidConditions .= "\n  Has unavailable storages: ";
+		$invalidConditions .= join(', ', $unavail_storages->@*);
+	    }
+
+	    if (my $unavail_resources = $not_allowed_nodes->{$target}->{'unavailable-resources'}) {
+		$invalidConditions .= "\n  Has unavailable resources ";
+		$invalidConditions .= join(', ', $unavail_resources->@*);
+	    }
+	}
+
 	if ($invalidConditions && $invalidConditions ne '') {
 	    print STDERR "skip VM $vmid - precondition check failed:";
 	    die "$invalidConditions\n";
-- 
2.39.2






* [pve-devel] [PATCH manager 2/3] bulk migrate: include checks for live-migratable local resources
  2024-03-20 12:51 [pve-devel] [PATCH qemu-server/manager] pci live migration followups Dominik Csapak
                   ` (3 preceding siblings ...)
  2024-03-20 12:51 ` [pve-devel] [PATCH manager 1/3] bulk migrate: improve precondition checks Dominik Csapak
@ 2024-03-20 12:51 ` Dominik Csapak
  2024-03-20 12:51 ` [pve-devel] [PATCH manager 3/3] ui: adapt migration window to precondition api change Dominik Csapak
  5 siblings, 0 replies; 13+ messages in thread
From: Dominik Csapak @ 2024-03-20 12:51 UTC (permalink / raw)
  To: pve-devel

those should be able to migrate even for online vms. If the mapping does
not exist on the target node, that will be caught further down anyway.

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
---
 PVE/API2/Nodes.pm | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/PVE/API2/Nodes.pm b/PVE/API2/Nodes.pm
index 1d5c68f5..7397101b 100644
--- a/PVE/API2/Nodes.pm
+++ b/PVE/API2/Nodes.pm
@@ -2241,9 +2241,18 @@ my $create_migrate_worker = sub {
 	    $invalidConditions .= join(', ', map { $_->{volid} } @{$preconditions->{local_disks}});
 	}
 
+	# for a live migration all local_resources must be marked as live-migratable
 	if ($online && scalar($preconditions->{local_resources}->@*)) {
-	    $invalidConditions .= "\n  Has local resources: ";
-	    $invalidConditions .= join(', ', @{$preconditions->{local_resources}});
+	    my $resource_not_live = [];
+	    for my $resource ($preconditions->{local_resources}->@*) {
+		next if grep { $_ eq $resource } $preconditions->{'mapped-with-live-migration'}->@*;
+		push $resource_not_live->@*, $resource;
+	    }
+
+	    if (scalar($resource_not_live->@*)) {
+		$invalidConditions .= "\n  Has local resources not marked as live migratable: ";
+		$invalidConditions .= join(', ', $resource_not_live->@*);
+	    }
 	}
 
 	if (my $not_allowed_nodes = $preconditions->{not_allowed_nodes}) {
-- 
2.39.2






* [pve-devel] [PATCH manager 3/3] ui: adapt migration window to precondition api change
  2024-03-20 12:51 [pve-devel] [PATCH qemu-server/manager] pci live migration followups Dominik Csapak
                   ` (4 preceding siblings ...)
  2024-03-20 12:51 ` [pve-devel] [PATCH manager 2/3] bulk migrate: include checks for live-migratable local resources Dominik Csapak
@ 2024-03-20 12:51 ` Dominik Csapak
  5 siblings, 0 replies; 13+ messages in thread
From: Dominik Csapak @ 2024-03-20 12:51 UTC (permalink / raw)
  To: pve-devel

we now return the 'not_allowed_nodes' also if the vm is running, when it
has mapped resources. So do those checks independently, so that the
user gets instant feedback on whether those resources exist on the selected target.

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
---
 www/manager6/window/Migrate.js | 24 ++++++++++++------------
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/www/manager6/window/Migrate.js b/www/manager6/window/Migrate.js
index 21806d50..f20251b5 100644
--- a/www/manager6/window/Migrate.js
+++ b/www/manager6/window/Migrate.js
@@ -104,7 +104,7 @@ Ext.define('PVE.window.Migrate', {
 	onTargetChange: function(nodeSelector) {
 	    // Always display the storages of the currently seleceted migration target
 	    this.lookup('pveDiskStorageSelector').setNodename(nodeSelector.value);
-	    this.checkMigratePreconditions();
+	    this.checkMigratePreconditions(true);
 	},
 
 	startMigration: function() {
@@ -214,12 +214,12 @@ Ext.define('PVE.window.Migrate', {
 		migration.possible = true;
 	    }
 	    migration.preconditions = [];
+	    let target = me.lookup('pveNodeSelector').value;
+	    let disallowed = migrateStats.not_allowed_nodes?.[target] ?? {};
 
 	    if (migrateStats.allowed_nodes) {
 		migration.allowedNodes = migrateStats.allowed_nodes;
-		let target = me.lookup('pveNodeSelector').value;
 		if (target.length && !migrateStats.allowed_nodes.includes(target)) {
-		    let disallowed = migrateStats.not_allowed_nodes[target] ?? {};
 		    if (disallowed.unavailable_storages !== undefined) {
 			let missingStorages = disallowed.unavailable_storages.join(', ');
 
@@ -230,17 +230,17 @@ Ext.define('PVE.window.Migrate', {
 			    severity: 'error',
 			});
 		    }
+		}
+	    }
 
-		    if (disallowed['unavailable-resources'] !== undefined) {
-			let unavailableResources = disallowed['unavailable-resources'].join(', ');
+	    if (disallowed['unavailable-resources'] !== undefined) {
+		let unavailableResources = disallowed['unavailable-resources'].join(', ');
 
-			migration.possible = false;
-			migration.preconditions.push({
-			    text: 'Mapped Resources (' + unavailableResources + ') not available on selected target. ',
-			    severity: 'error',
-			});
-		    }
-		}
+		migration.possible = false;
+		migration.preconditions.push({
+		    text: 'Mapped Resources (' + unavailableResources + ') not available on selected target. ',
+		    severity: 'error',
+		});
 	    }
 
 	    let blockingResources = [];
-- 
2.39.2






* Re: [pve-devel] [PATCH qemu-server 3/3] api: include not mapped resources for running vms in migrate preconditions
  2024-03-20 12:51 ` [pve-devel] [PATCH qemu-server 3/3] api: include not mapped resources for running vms in migrate preconditions Dominik Csapak
@ 2024-03-22 14:53   ` Stefan Sterz
  2024-03-22 16:19   ` Fiona Ebner
  1 sibling, 0 replies; 13+ messages in thread
From: Stefan Sterz @ 2024-03-22 14:53 UTC (permalink / raw)
  To: Proxmox VE development discussion

On Wed Mar 20, 2024 at 1:51 PM CET, Dominik Csapak wrote:
> so that we can show a proper warning in the migrate dialog and check it
> in the bulk migrate precondition check
>
> the unavailable_storages and allowed_nodes should be the same as before
>
> Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
> ---
> not super happy with this partial approach, we probably should just
> always return the 'allowed_nodes' and 'not_allowed_nodes' and change
> the gui to handle the running vs not running state?
>
>  PVE/API2/Qemu.pm | 27 +++++++++++++++------------
>  1 file changed, 15 insertions(+), 12 deletions(-)
>
> diff --git a/PVE/API2/Qemu.pm b/PVE/API2/Qemu.pm
> index 8581a529..b0f155f7 100644
> --- a/PVE/API2/Qemu.pm
> +++ b/PVE/API2/Qemu.pm
> @@ -4439,7 +4439,7 @@ __PACKAGE__->register_method({
>  	    not_allowed_nodes => {
>  		type => 'object',
>  		optional => 1,
> -		description => "List not allowed nodes with additional informations, only passed if VM is offline"
> +		description => "List not allowed nodes with additional informations",

"information" has no plural, this should just be "additional
information".

>  	    },
>  	    local_disks => {
>  		type => 'array',
> @@ -4496,25 +4496,28 @@ __PACKAGE__->register_method({
>
>  	# if vm is not running, return target nodes where local storage/mapped devices are available
>  	# for offline migration
> +	my $checked_nodes = {};
> +	my $allowed_nodes = [];
>  	if (!$res->{running}) {
> -	    $res->{allowed_nodes} = [];
> -	    my $checked_nodes = PVE::QemuServer::check_local_storage_availability($vmconf, $storecfg);
> +	    $checked_nodes = PVE::QemuServer::check_local_storage_availability($vmconf, $storecfg);
>  	    delete $checked_nodes->{$localnode};
> +	}
>
> -	    foreach my $node (keys %$checked_nodes) {
> -		my $missing_mappings = $missing_mappings_by_node->{$node};
> -		if (scalar($missing_mappings->@*)) {
> -		    $checked_nodes->{$node}->{'unavailable-resources'} = $missing_mappings;
> -		    next;
> -		}
> +	foreach my $node ((keys $checked_nodes->%*, keys $missing_mappings_by_node->%*)) {
> +	    my $missing_mappings = $missing_mappings_by_node->{$node};
> +	    if (scalar($missing_mappings->@*)) {
> +		$checked_nodes->{$node}->{'unavailable-resources'} = $missing_mappings;
> +		next;
> +	    }
>
> +	    if (!$res->{running}) {
>  		if (!defined($checked_nodes->{$node}->{unavailable_storages})) {
> -		    push @{$res->{allowed_nodes}}, $node;
> +		    push $allowed_nodes->@*, $node;
>  		}
> -
>  	    }
> -	    $res->{not_allowed_nodes} = $checked_nodes;
>  	}
> +	$res->{not_allowed_nodes} = $checked_nodes if scalar(keys($checked_nodes->%*)) || !$res->{running};
> +	$res->{allowed_nodes} = $allowed_nodes if scalar($allowed_nodes->@*) || !$res->{running};
>
>  	my $local_disks = &$check_vm_disks_local($storecfg, $vmconf, $vmid);
>  	$res->{local_disks} = [ values %$local_disks ];;






* Re: [pve-devel] [PATCH qemu-server 1/3] stop cleanup: remove unnecessary tpmstate cleanup
  2024-03-20 12:51 ` [pve-devel] [PATCH qemu-server 1/3] stop cleanup: remove unnecessary tpmstate cleanup Dominik Csapak
@ 2024-03-22 14:54   ` Fiona Ebner
  0 siblings, 0 replies; 13+ messages in thread
From: Fiona Ebner @ 2024-03-22 14:54 UTC (permalink / raw)
  To: Proxmox VE development discussion, Dominik Csapak

On 20.03.24 at 13:51, Dominik Csapak wrote:
> tpmstate0 is already included in `get_vm_volumes`, and our only storage
> plugin that has unmap_volume implemented is the RBDPlugin, where we call
> unmap in `deactivate_volume`. So it's already unmapped by the
> `deactivate_volumes` calls above.
> 
> Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
> ---

There are third-party storage plugins too. But it's natural to expect
that deactivate_volume() would also remove a mapping for the volume.

OTOH, there is an explicit map_volume() call in start_swtpm() (can't use
librbd), so it's natural to expect an explicit unmap_volume() call.
However, the order of calls right now is
1. activate_volume()
2. map_volume()
3. deactivate_volume()
4. unmap_volume()

I'd rather expect 3 and 4 to be swapped if doing it "properly".
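
Sketched out (hypothetical, reusing the names from the removed hunk, not a
concrete proposal):

    # setup in start_swtpm():  1. activate_volume()  2. map_volume()
    # teardown, mirrored:      unmap first, then deactivate
    if (my $tpmdrive = $conf->{tpmstate0}) {
        my $tpm = parse_drive("tpmstate0", $tpmdrive);
        PVE::Storage::unmap_volume($storecfg, $tpm->{file});   # 4. before ...
    }
    PVE::Storage::deactivate_volumes($storecfg, $vollist);     # ... 3. after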

All that said, I think we can try dropping this and risk the, arguably
unlikely, scenario that a third-party plugin breaks. If it really
happens, then we can adapt according to the actual needs.

Acked-by: Fiona Ebner <f.ebner@proxmox.com>

>  PVE/QemuServer.pm | 8 --------
>  1 file changed, 8 deletions(-)
> 
> diff --git a/PVE/QemuServer.pm b/PVE/QemuServer.pm
> index ef3aee20..d53e9693 100644
> --- a/PVE/QemuServer.pm
> +++ b/PVE/QemuServer.pm
> @@ -6167,14 +6167,6 @@ sub vm_stop_cleanup {
>  	if (!$keepActive) {
>  	    my $vollist = get_vm_volumes($conf);
>  	    PVE::Storage::deactivate_volumes($storecfg, $vollist);
> -
> -	    if (my $tpmdrive = $conf->{tpmstate0}) {
> -		my $tpm = parse_drive("tpmstate0", $tpmdrive);
> -		my ($storeid, $volname) = PVE::Storage::parse_volume_id($tpm->{file}, 1);
> -		if ($storeid) {
> -		    PVE::Storage::unmap_volume($storecfg, $tpm->{file});
> -		}
> -	    }
>  	}
>  
>  	foreach my $ext (qw(mon qmp pid vnc qga)) {





* Re: [pve-devel] [PATCH qemu-server 2/3] migrate: call vm_stop_cleanup after stopping in phase3_cleanup
  2024-03-20 12:51 ` [pve-devel] [PATCH qemu-server 2/3] migrate: call vm_stop_cleanup after stopping in phase3_cleanup Dominik Csapak
@ 2024-03-22 15:17   ` Fiona Ebner
  0 siblings, 0 replies; 13+ messages in thread
From: Fiona Ebner @ 2024-03-22 15:17 UTC (permalink / raw)
  To: Proxmox VE development discussion, Dominik Csapak

On 20.03.24 at 13:51, Dominik Csapak wrote:
> @@ -1591,12 +1593,10 @@ sub phase3_cleanup {
>  	$self->{errors} = 1;
>      }
>  
> -    # always deactivate volumes - avoid lvm LVs to be active on several nodes
> -    eval {
> -	PVE::Storage::deactivate_volumes($self->{storecfg}, $sourcevollist);
> -    };
> +    # stop with nocheck does not do a cleanup, so do it here with the original config
> +    eval { PVE::QemuServer::vm_stop_cleanup($self->{storecfg}, $vmid, $oldconf, 0) };

The function has more parameters, so this does not set noerr to 0.

>      if (my $err = $@) {
> -	$self->log('err', $err);
> +	$self->log('err', "cleanup for vm failed - $err");
>  	$self->{errors} = 1;
>      }
>  
> diff --git a/PVE/QemuServer.pm b/PVE/QemuServer.pm
> index d53e9693..54f73093 100644
> --- a/PVE/QemuServer.pm
> +++ b/PVE/QemuServer.pm
> @@ -6160,7 +6160,9 @@ sub cleanup_pci_devices {
>  }
>  
>  sub vm_stop_cleanup {
> -    my ($storecfg, $vmid, $conf, $keepActive, $apply_pending_changes) = @_;
> +    my ($storecfg, $vmid, $conf, $keepActive, $apply_pending_changes, $noerr) = @_;
> +
> +    $noerr //= 1; # defaults to warning

Not too happy about this, because usually passing undef and 0 for a
"boolean" is the same, but not here. And Perl won't help you. There's
not many callers, maybe just adapt them instead? Should be its own patch
then.
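
To spell the pitfall out (illustrative calls only, using the signature from
this patch):

    # vm_stop_cleanup($storecfg, $vmid, $conf, $keepActive, $apply_pending_changes, $noerr)
    vm_stop_cleanup($storecfg, $vmid, $conf, 0, 0);        # $noerr undef -> defaults to 1 -> warn
    vm_stop_cleanup($storecfg, $vmid, $conf, 0, 0, 0);     # explicit 0                    -> die
    vm_stop_cleanup($storecfg, $vmid, $conf, 0, 0, undef); # undef again                   -> warn
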

>  
>      eval {
>  
> @@ -6186,7 +6188,13 @@ sub vm_stop_cleanup {
>  
>  	vmconfig_apply_pending($vmid, $conf, $storecfg) if $apply_pending_changes;
>      };
> -    warn $@ if $@; # avoid errors - just warn
> +    if (my $err = $@) {
> +	if ($noerr) {
> +	    warn $err;
> +	} else {
> +	    die $err;
> +	}

Style nit: we usually have something like
die $err if !$noerr;
warn $err;
which avoids the line bloat.

> +    }
>  }
>  
>  # call only in locked context





* Re: [pve-devel] [PATCH qemu-server 3/3] api: include not mapped resources for running vms in migrate preconditions
  2024-03-20 12:51 ` [pve-devel] [PATCH qemu-server 3/3] api: include not mapped resources for running vms in migrate preconditions Dominik Csapak
  2024-03-22 14:53   ` Stefan Sterz
@ 2024-03-22 16:19   ` Fiona Ebner
  2024-04-02  9:39     ` Dominik Csapak
  1 sibling, 1 reply; 13+ messages in thread
From: Fiona Ebner @ 2024-03-22 16:19 UTC (permalink / raw)
  To: Proxmox VE development discussion, Dominik Csapak

On 20.03.24 at 13:51, Dominik Csapak wrote:
> so that we can show a proper warning in the migrate dialog and check it
> in the bulk migrate precondition check
> 
> the unavailable_storages and allowed_nodes should be the same as before
> 
> Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
> ---
> not super happy with this partial approach, we probably should just
> always return the 'allowed_nodes' and 'not_allowed_nodes' and change
> the gui to handle the running vs not running state?

So not_allowed_nodes can already be returned in both states after this
patch. But allowed nodes still only if not running. I mean, there could
be API users that break if we'd always return allowed_nodes, but it
doesn't sound unreasonable for me to do so. Might even be an opportunity
to structure the code in a bit more straightforward manner.
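
For example, the result could then always have the same shape, independent of
the running state (node names and values below are made up, purely to sketch
the idea):

    # allowed_nodes     => [ 'node2', 'node3' ],
    # not_allowed_nodes => {
    #     node4 => {
    #         unavailable_storages    => [ 'local-lvm' ],   # only relevant when offline
    #         'unavailable-resources' => [ 'gpu1' ],        # missing mapped devices
    #     },
    # },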

> 
>  PVE/API2/Qemu.pm | 27 +++++++++++++++------------
>  1 file changed, 15 insertions(+), 12 deletions(-)
> 
> diff --git a/PVE/API2/Qemu.pm b/PVE/API2/Qemu.pm
> index 8581a529..b0f155f7 100644
> --- a/PVE/API2/Qemu.pm
> +++ b/PVE/API2/Qemu.pm
> @@ -4439,7 +4439,7 @@ __PACKAGE__->register_method({
>  	    not_allowed_nodes => {
>  		type => 'object',
>  		optional => 1,
> -		description => "List not allowed nodes with additional informations, only passed if VM is offline"
> +		description => "List not allowed nodes with additional informations",
>  	    },
>  	    local_disks => {
>  		type => 'array',
> @@ -4496,25 +4496,28 @@ __PACKAGE__->register_method({
>  
>  	# if vm is not running, return target nodes where local storage/mapped devices are available
>  	# for offline migration
> +	my $checked_nodes = {};
> +	my $allowed_nodes = [];
>  	if (!$res->{running}) {
> -	    $res->{allowed_nodes} = [];
> -	    my $checked_nodes = PVE::QemuServer::check_local_storage_availability($vmconf, $storecfg);
> +	    $checked_nodes = PVE::QemuServer::check_local_storage_availability($vmconf, $storecfg);
>  	    delete $checked_nodes->{$localnode};
> +	}
>  
> -	    foreach my $node (keys %$checked_nodes) {
> -		my $missing_mappings = $missing_mappings_by_node->{$node};
> -		if (scalar($missing_mappings->@*)) {
> -		    $checked_nodes->{$node}->{'unavailable-resources'} = $missing_mappings;
> -		    next;
> -		}
> +	foreach my $node ((keys $checked_nodes->%*, keys $missing_mappings_by_node->%*)) {

Style nit: please use 'for' instead of 'foreach'

Like this you might iterate over certain nodes twice and then push them
onto the allowed_nodes array twice.

> +	    my $missing_mappings = $missing_mappings_by_node->{$node};
> +	    if (scalar($missing_mappings->@*)) {
> +		$checked_nodes->{$node}->{'unavailable-resources'} = $missing_mappings;
> +		next;
> +	    }
>  
> +	    if (!$res->{running}) {
>  		if (!defined($checked_nodes->{$node}->{unavailable_storages})) {
> -		    push @{$res->{allowed_nodes}}, $node;
> +		    push $allowed_nodes->@*, $node;
>  		}
> -
>  	    }
> -	    $res->{not_allowed_nodes} = $checked_nodes;
>  	}
> +	$res->{not_allowed_nodes} = $checked_nodes if scalar(keys($checked_nodes->%*)) || !$res->{running};

Why not return the empty hash if running? The whole post-if is just
covering that single special case.

> +	$res->{allowed_nodes} = $allowed_nodes if scalar($allowed_nodes->@*) || !$res->{running};

Nit: Right now, $allowed_nodes can only be non-empty if
!$res->{running}, so the first part of the check is redundant.

>  
>  	my $local_disks = &$check_vm_disks_local($storecfg, $vmconf, $vmid);
>  	$res->{local_disks} = [ values %$local_disks ];;





* Re: [pve-devel] [PATCH qemu-server 3/3] api: include not mapped resources for running vms in migrate preconditions
  2024-03-22 16:19   ` Fiona Ebner
@ 2024-04-02  9:39     ` Dominik Csapak
  2024-04-10 10:52       ` Fiona Ebner
  0 siblings, 1 reply; 13+ messages in thread
From: Dominik Csapak @ 2024-04-02  9:39 UTC (permalink / raw)
  To: Fiona Ebner, Proxmox VE development discussion

On 3/22/24 17:19, Fiona Ebner wrote:
> On 20.03.24 at 13:51, Dominik Csapak wrote:
>> so that we can show a proper warning in the migrate dialog and check it
>> in the bulk migrate precondition check
>>
>> the unavailable_storages and allowed_nodes should be the same as before
>>
>> Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
>> ---
>> not super happy with this partial approach, we probably should just
>> always return the 'allowed_nodes' and 'not_allowed_nodes' and change
>> the gui to handle the running vs not running state?
> 
> So not_allowed_nodes can already be returned in both states after this
> patch. But allowed nodes still only if not running. I mean, there could
> be API users that break if we'd always return allowed_nodes, but it
> doesn't sound unreasonable for me to do so. Might even be an opportunity
> to structure the code in a bit more straightforward manner.

yes, as said previously i'd like to rework this api call a bit to make it more practical
but that probably has to wait for the next major release

as for returning 'allowed_nodes' always, we'd have to adapt the gui of course,
but if we don't deem it 'too breaking' i'd rework that a bit even now

> 
>>
>>   PVE/API2/Qemu.pm | 27 +++++++++++++++------------
>>   1 file changed, 15 insertions(+), 12 deletions(-)
>>
>> diff --git a/PVE/API2/Qemu.pm b/PVE/API2/Qemu.pm
>> index 8581a529..b0f155f7 100644
>> --- a/PVE/API2/Qemu.pm
>> +++ b/PVE/API2/Qemu.pm
>> @@ -4439,7 +4439,7 @@ __PACKAGE__->register_method({
>>   	    not_allowed_nodes => {
>>   		type => 'object',
>>   		optional => 1,
>> -		description => "List not allowed nodes with additional informations, only passed if VM is offline"
>> +		description => "List not allowed nodes with additional informations",
>>   	    },
>>   	    local_disks => {
>>   		type => 'array',
>> @@ -4496,25 +4496,28 @@ __PACKAGE__->register_method({
>>   
>>   	# if vm is not running, return target nodes where local storage/mapped devices are available
>>   	# for offline migration
>> +	my $checked_nodes = {};
>> +	my $allowed_nodes = [];
>>   	if (!$res->{running}) {
>> -	    $res->{allowed_nodes} = [];
>> -	    my $checked_nodes = PVE::QemuServer::check_local_storage_availability($vmconf, $storecfg);
>> +	    $checked_nodes = PVE::QemuServer::check_local_storage_availability($vmconf, $storecfg);
>>   	    delete $checked_nodes->{$localnode};
>> +	}
>>   
>> -	    foreach my $node (keys %$checked_nodes) {
>> -		my $missing_mappings = $missing_mappings_by_node->{$node};
>> -		if (scalar($missing_mappings->@*)) {
>> -		    $checked_nodes->{$node}->{'unavailable-resources'} = $missing_mappings;
>> -		    next;
>> -		}
>> +	foreach my $node ((keys $checked_nodes->%*, keys $missing_mappings_by_node->%*)) {
> 
> Style nit: please use 'for' instead of 'foreach'
> 
> Like this you might iterate over certain nodes twice and then push them
> onto the allowed_nodes array twice.

oops, yes ^^

> 
>> +	    my $missing_mappings = $missing_mappings_by_node->{$node};
>> +	    if (scalar($missing_mappings->@*)) {
>> +		$checked_nodes->{$node}->{'unavailable-resources'} = $missing_mappings;
>> +		next;
>> +	    }
>>   
>> +	    if (!$res->{running}) {
>>   		if (!defined($checked_nodes->{$node}->{unavailable_storages})) {
>> -		    push @{$res->{allowed_nodes}}, $node;
>> +		    push $allowed_nodes->@*, $node;
>>   		}
>> -
>>   	    }
>> -	    $res->{not_allowed_nodes} = $checked_nodes;
>>   	}
>> +	$res->{not_allowed_nodes} = $checked_nodes if scalar(keys($checked_nodes->%*)) || !$res->{running};
> 
> Why not return the empty hash if running? The whole post-if is just
> covering that single special case.
> 
>> +	$res->{allowed_nodes} = $allowed_nodes if scalar($allowed_nodes->@*) || !$res->{running};
> 
> Nit: Right now, $allowed_nodes can only be non-empty if
> !$res->{running}, so the first part of the check is redundant.
> 

true

>>   
>>   	my $local_disks = &$check_vm_disks_local($storecfg, $vmconf, $vmid);
>>   	$res->{local_disks} = [ values %$local_disks ];;






* Re: [pve-devel] [PATCH qemu-server 3/3] api: include not mapped resources for running vms in migrate preconditions
  2024-04-02  9:39     ` Dominik Csapak
@ 2024-04-10 10:52       ` Fiona Ebner
  0 siblings, 0 replies; 13+ messages in thread
From: Fiona Ebner @ 2024-04-10 10:52 UTC (permalink / raw)
  To: Dominik Csapak, Proxmox VE development discussion

On 02.04.24 at 11:39, Dominik Csapak wrote:
> On 3/22/24 17:19, Fiona Ebner wrote:
>> On 20.03.24 at 13:51, Dominik Csapak wrote:
>>> so that we can show a proper warning in the migrate dialog and check it
>>> in the bulk migrate precondition check
>>>
>>> the unavailable_storages and allowed_nodes should be the same as before
>>>
>>> Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
>>> ---
>>> not super happy with this partial approach, we probably should just
>>> always return the 'allowed_nodes' and 'not_allowed_nodes' and change
>>> the gui to handle the running vs not running state?
>>
>> So not_allowed_nodes can already be returned in both states after this
>> patch. But allowed nodes still only if not running. I mean, there could
>> be API users that break if we'd always return allowed_nodes, but it
>> doesn't sound unreasonable for me to do so. Might even be an opportunity
>> to structure the code in a bit more straightforward manner.
> 
> yes, as said previously i'd like to rework this api call a bit to make it more
> practical
> but that probably has to wait for the next major release
> 
> as for returning 'allowed_nodes' always, we'd have to adapt the gui of
> course,
> but if we don't deem it 'too breaking' i'd rework that a bit even now
> 

Thinking about it in general for existing API users:

1. If allowed_nodes is not checked for live-migration, no breakage.

2. If allowed_nodes is checked for live-migration, the API user just
becomes more accurate (as long as what we return is correct).

3. If there is an assert that allowed_nodes is not returned for
live-migration, breakage.

4. If presence of allowed_nodes is used to guess whether it's a
live-migration or not, breakage. But this is just a bug IMHO.
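
For case 4, that guess would look something like this on the consumer side
(made-up client code, just to show why it is fragile):

    # fragile: inferring the migration mode from the presence of a field
    my $is_live_guess = !defined($preconditions->{allowed_nodes});
    # robust: use the state the API reports explicitly
    my $is_live = $preconditions->{running} ? 1 : 0;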





end of thread (newest message: 2024-04-10 10:52 UTC)

Thread overview: 13+ messages
2024-03-20 12:51 [pve-devel] [PATCH qemu-server/manager] pci live migration followups Dominik Csapak
2024-03-20 12:51 ` [pve-devel] [PATCH qemu-server 1/3] stop cleanup: remove unnecessary tpmstate cleanup Dominik Csapak
2024-03-22 14:54   ` Fiona Ebner
2024-03-20 12:51 ` [pve-devel] [PATCH qemu-server 2/3] migrate: call vm_stop_cleanup after stopping in phase3_cleanup Dominik Csapak
2024-03-22 15:17   ` Fiona Ebner
2024-03-20 12:51 ` [pve-devel] [PATCH qemu-server 3/3] api: include not mapped resources for running vms in migrate preconditions Dominik Csapak
2024-03-22 14:53   ` Stefan Sterz
2024-03-22 16:19   ` Fiona Ebner
2024-04-02  9:39     ` Dominik Csapak
2024-04-10 10:52       ` Fiona Ebner
2024-03-20 12:51 ` [pve-devel] [PATCH manager 1/3] bulk migrate: improve precondition checks Dominik Csapak
2024-03-20 12:51 ` [pve-devel] [PATCH manager 2/3] bulk migrate: include checks for live-migratable local resources Dominik Csapak
2024-03-20 12:51 ` [pve-devel] [PATCH manager 3/3] ui: adapt migration window to precondition api change Dominik Csapak
