* [pve-devel] [PATCH-SERIES container/ha-manager/manager/qemu-server 00/12] HA node affinity blockers (#1497)
@ 2025-12-15 15:52 Daniel Kral
2025-12-15 15:52 ` [pve-devel] [PATCH ha-manager 1/9] ha: put source files on individual new lines Daniel Kral
` (11 more replies)
0 siblings, 12 replies; 13+ messages in thread
From: Daniel Kral @ 2025-12-15 15:52 UTC (permalink / raw)
To: pve-devel
This patch series implements node affinity rule migration blockers
similar to the blockers introduced with resource affinity rules.
The node affinity rule migration blockers prevent users from migrating
HA resources to nodes from which they would be migrated away again
immediately afterwards. This includes:
- online nodes that are not part of the strict node affinity rule's
allowed node set at all, or
- online nodes that are not in the currently highest-priority group of
the strict or non-strict node affinity rule while the HA resource has
failback set (see the example rule below).
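For illustration: the following strict node affinity rule (taken from the
test-node-affinity-strict7 case added later in this series) restricts
vm:101 to node1 (priority 1) and node3 (priority 2):

  node-affinity: vm101-must-be-on-node1-node3
      nodes node1:1,node3:2
      resources vm:101
      strict 1

With this series, a manual migration of vm:101 to node2 is rejected, as
node2 is not in the allowed node set, and with failback enabled a
migration to node1 is rejected as well, since node3 is the
highest-priority node.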
The first few patches are general cleanup of code the series touches,
deduplicate the resource_motion_info logic so it is shared between the
Manager and the public PVE::HA::Config::get_resource_motion_info(...),
and expose this information in the relevant VM/LXC API handlers and the
web interface.
Since the 'cause' property in 'blocking-resources' is an enum,
qemu-server and pve-container need versioned dependency bumps on
pve-ha-manager, as otherwise the API handlers would fail with a schema
result verification error.
If we can accept some coupling between those, we could generate the
schema for these API handlers from the pve-ha-manager package so that
future version bumps are no longer needed, but I was a bit hesitant to
implement that in the v1.
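For reference, each entry in 'blocking-resources' as assembled by the new
helper (patches 6 and 8) is a small hash of the blocking resource id and
the cause, roughly like the following (the sid value is just an example):

  { sid => 'vm:102', cause => 'resource-affinity' } # or 'node-affinity'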
ha-manager:
Daniel Kral (9):
ha: put source files on individual new lines
d/pve-ha-manager.install: remove duplicate Config.pm
config: group and sort use statements
manager: group and sort use statements
manager: report all reasons when resources are blocked from migration
config, manager: factor out resource motion info logic
tests: add test cases for migrating resources with node affinity rules
handle strict node affinity rules in manual migrations
handle node affinity rules with failback in manual migrations
debian/pve-ha-manager.install | 2 +-
src/PVE/API2/HA/Resources.pm | 4 +-
src/PVE/CLI/ha_manager.pm | 14 ++--
src/PVE/HA/Config.pm | 47 +++++---------
src/PVE/HA/Helpers.pm | 63 ++++++++++++++++++
src/PVE/HA/Makefile | 16 ++++-
src/PVE/HA/Manager.pm | 61 ++++++++---------
.../test-node-affinity-nonstrict1/log.expect | 16 +----
src/test/test-node-affinity-nonstrict7/README | 9 +++
.../test-node-affinity-nonstrict7/cmdlist | 9 +++
.../hardware_status | 5 ++
.../test-node-affinity-nonstrict7/log.expect | 65 +++++++++++++++++++
.../manager_status | 1 +
.../rules_config | 7 ++
.../service_config | 4 ++
.../test-node-affinity-strict1/log.expect | 16 +----
.../test-node-affinity-strict2/log.expect | 16 +----
src/test/test-node-affinity-strict7/README | 9 +++
src/test/test-node-affinity-strict7/cmdlist | 9 +++
.../hardware_status | 5 ++
.../test-node-affinity-strict7/log.expect | 51 +++++++++++++++
.../test-node-affinity-strict7/manager_status | 1 +
.../test-node-affinity-strict7/rules_config | 9 +++
.../test-node-affinity-strict7/service_config | 4 ++
src/test/test-recovery4/log.expect | 2 +-
25 files changed, 327 insertions(+), 118 deletions(-)
create mode 100644 src/PVE/HA/Helpers.pm
create mode 100644 src/test/test-node-affinity-nonstrict7/README
create mode 100644 src/test/test-node-affinity-nonstrict7/cmdlist
create mode 100644 src/test/test-node-affinity-nonstrict7/hardware_status
create mode 100644 src/test/test-node-affinity-nonstrict7/log.expect
create mode 100644 src/test/test-node-affinity-nonstrict7/manager_status
create mode 100644 src/test/test-node-affinity-nonstrict7/rules_config
create mode 100644 src/test/test-node-affinity-nonstrict7/service_config
create mode 100644 src/test/test-node-affinity-strict7/README
create mode 100644 src/test/test-node-affinity-strict7/cmdlist
create mode 100644 src/test/test-node-affinity-strict7/hardware_status
create mode 100644 src/test/test-node-affinity-strict7/log.expect
create mode 100644 src/test/test-node-affinity-strict7/manager_status
create mode 100644 src/test/test-node-affinity-strict7/rules_config
create mode 100644 src/test/test-node-affinity-strict7/service_config
qemu-server:
Daniel Kral (1):
api: migration preconditions: add node affinity as blocking cause
src/PVE/API2/Qemu.pm | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
container:
Daniel Kral (1):
api: migration preconditions: add node affinity as blocking cause
src/PVE/API2/LXC.pm | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
manager:
Daniel Kral (1):
ui: migrate: display precondition messages for ha node affinity
www/manager6/window/Migrate.js | 10 ++++++++++
1 file changed, 10 insertions(+)
Summary over all repositories:
28 files changed, 339 insertions(+), 120 deletions(-)
--
Generated by murpp 0.9.0
* [pve-devel] [PATCH ha-manager 1/9] ha: put source files on individual new lines
2025-12-15 15:52 [pve-devel] [PATCH-SERIES container/ha-manager/manager/qemu-server 00/12] HA node affinity blockers (#1497) Daniel Kral
@ 2025-12-15 15:52 ` Daniel Kral
2025-12-15 15:52 ` [pve-devel] [PATCH ha-manager 2/9] d/pve-ha-manager.install: remove duplicate Config.pm Daniel Kral
` (10 subsequent siblings)
11 siblings, 0 replies; 13+ messages in thread
From: Daniel Kral @ 2025-12-15 15:52 UTC (permalink / raw)
To: pve-devel
There are quite a lot of source files in the list already. To reduce
noise in diffs when the list changes, put each file on its own line.
Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
src/PVE/HA/Makefile | 15 +++++++++++++--
1 file changed, 13 insertions(+), 2 deletions(-)
diff --git a/src/PVE/HA/Makefile b/src/PVE/HA/Makefile
index 0b240e1e..1aeb976b 100644
--- a/src/PVE/HA/Makefile
+++ b/src/PVE/HA/Makefile
@@ -1,5 +1,16 @@
-SIM_SOURCES=CRM.pm Env.pm Groups.pm HashTools.pm Rules.pm Resources.pm LRM.pm \
- Manager.pm NodeStatus.pm Tools.pm FenceConfig.pm Fence.pm Usage.pm
+SIM_SOURCES=CRM.pm \
+ Env.pm \
+ Groups.pm \
+ HashTools.pm \
+ Rules.pm \
+ Resources.pm \
+ LRM.pm \
+ Manager.pm \
+ NodeStatus.pm \
+ Tools.pm \
+ FenceConfig.pm \
+ Fence.pm \
+ Usage.pm
SOURCES=${SIM_SOURCES} Config.pm
--
2.47.3
* [pve-devel] [PATCH ha-manager 2/9] d/pve-ha-manager.install: remove duplicate Config.pm
2025-12-15 15:52 [pve-devel] [PATCH-SERIES container/ha-manager/manager/qemu-server 00/12] HA node affinity blockers (#1497) Daniel Kral
2025-12-15 15:52 ` [pve-devel] [PATCH ha-manager 1/9] ha: put source files on individual new lines Daniel Kral
@ 2025-12-15 15:52 ` Daniel Kral
2025-12-15 15:52 ` [pve-devel] [PATCH ha-manager 3/9] config: group and sort use statements Daniel Kral
` (9 subsequent siblings)
11 siblings, 0 replies; 13+ messages in thread
From: Daniel Kral @ 2025-12-15 15:52 UTC (permalink / raw)
To: pve-devel
Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
debian/pve-ha-manager.install | 1 -
1 file changed, 1 deletion(-)
diff --git a/debian/pve-ha-manager.install b/debian/pve-ha-manager.install
index 38d5d60b..bdb1feeb 100644
--- a/debian/pve-ha-manager.install
+++ b/debian/pve-ha-manager.install
@@ -21,7 +21,6 @@
/usr/share/perl5/PVE/CLI/ha_manager.pm
/usr/share/perl5/PVE/HA/CRM.pm
/usr/share/perl5/PVE/HA/Config.pm
-/usr/share/perl5/PVE/HA/Config.pm
/usr/share/perl5/PVE/HA/Env.pm
/usr/share/perl5/PVE/HA/Env/PVE2.pm
/usr/share/perl5/PVE/HA/Fence.pm
--
2.47.3
* [pve-devel] [PATCH ha-manager 3/9] config: group and sort use statements
2025-12-15 15:52 [pve-devel] [PATCH-SERIES container/ha-manager/manager/qemu-server 00/12] HA node affinity blockers (#1497) Daniel Kral
2025-12-15 15:52 ` [pve-devel] [PATCH ha-manager 1/9] ha: put source files on individual new lines Daniel Kral
2025-12-15 15:52 ` [pve-devel] [PATCH ha-manager 2/9] d/pve-ha-manager.install: remove duplicate Config.pm Daniel Kral
@ 2025-12-15 15:52 ` Daniel Kral
2025-12-15 15:52 ` [pve-devel] [PATCH ha-manager 4/9] manager: " Daniel Kral
` (8 subsequent siblings)
11 siblings, 0 replies; 13+ messages in thread
From: Daniel Kral @ 2025-12-15 15:52 UTC (permalink / raw)
To: pve-devel
Group and sort use statements according to our Perl Style guide [0].
[0] https://pve.proxmox.com/wiki/Perl_Style_Guide#Module_Dependencies
Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
src/PVE/HA/Config.pm | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/src/PVE/HA/Config.pm b/src/PVE/HA/Config.pm
index 04e039e0..1199b0d4 100644
--- a/src/PVE/HA/Config.pm
+++ b/src/PVE/HA/Config.pm
@@ -5,12 +5,13 @@ use warnings;
use JSON;
-use PVE::HA::Tools;
+use PVE::Cluster qw(cfs_register_file cfs_read_file cfs_write_file cfs_lock_file);
+
use PVE::HA::Groups;
+use PVE::HA::Resources;
use PVE::HA::Rules;
use PVE::HA::Rules::ResourceAffinity qw(get_affinitive_resources);
-use PVE::Cluster qw(cfs_register_file cfs_read_file cfs_write_file cfs_lock_file);
-use PVE::HA::Resources;
+use PVE::HA::Tools;
my $manager_status_filename = "ha/manager_status";
my $ha_groups_config = "ha/groups.cfg";
--
2.47.3
* [pve-devel] [PATCH ha-manager 4/9] manager: group and sort use statements
2025-12-15 15:52 [pve-devel] [PATCH-SERIES container/ha-manager/manager/qemu-server 00/12] HA node affinity blockers (#1497) Daniel Kral
` (2 preceding siblings ...)
2025-12-15 15:52 ` [pve-devel] [PATCH ha-manager 3/9] config: group and sort use statements Daniel Kral
@ 2025-12-15 15:52 ` Daniel Kral
2025-12-15 15:52 ` [pve-devel] [PATCH ha-manager 5/9] manager: report all reasons when resources are blocked from migration Daniel Kral
` (7 subsequent siblings)
11 siblings, 0 replies; 13+ messages in thread
From: Daniel Kral @ 2025-12-15 15:52 UTC (permalink / raw)
To: pve-devel
Group and sort use statements according to our Perl Style guide [0].
[0] https://pve.proxmox.com/wiki/Perl_Style_Guide#Module_Dependencies
Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
src/PVE/HA/Manager.pm | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/src/PVE/HA/Manager.pm b/src/PVE/HA/Manager.pm
index f5843dd4..95bddabc 100644
--- a/src/PVE/HA/Manager.pm
+++ b/src/PVE/HA/Manager.pm
@@ -6,13 +6,14 @@ use warnings;
use Digest::MD5 qw(md5_base64);
use PVE::Tools;
+
use PVE::HA::Groups;
-use PVE::HA::Tools ':exit_codes';
use PVE::HA::NodeStatus;
use PVE::HA::Rules;
use PVE::HA::Rules::NodeAffinity qw(get_node_affinity);
use PVE::HA::Rules::ResourceAffinity
qw(get_affinitive_resources get_resource_affinity apply_positive_resource_affinity apply_negative_resource_affinity);
+use PVE::HA::Tools ':exit_codes';
use PVE::HA::Usage::Basic;
my $have_static_scheduling;
--
2.47.3
* [pve-devel] [PATCH ha-manager 5/9] manager: report all reasons when resources are blocked from migration
2025-12-15 15:52 [pve-devel] [PATCH-SERIES container/ha-manager/manager/qemu-server 00/12] HA node affinity blockers (#1497) Daniel Kral
` (3 preceding siblings ...)
2025-12-15 15:52 ` [pve-devel] [PATCH ha-manager 4/9] manager: " Daniel Kral
@ 2025-12-15 15:52 ` Daniel Kral
2025-12-15 15:52 ` [pve-devel] [PATCH ha-manager 6/9] config, manager: factor out resource motion info logic Daniel Kral
` (6 subsequent siblings)
11 siblings, 0 replies; 13+ messages in thread
From: Daniel Kral @ 2025-12-15 15:52 UTC (permalink / raw)
To: pve-devel
PVE::HA::Config::get_resource_motion_info(...) already reports all
reasons to callers, so log that information in the HA Manager as well.
Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
src/PVE/HA/Manager.pm | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/src/PVE/HA/Manager.pm b/src/PVE/HA/Manager.pm
index 95bddabc..74e898f9 100644
--- a/src/PVE/HA/Manager.pm
+++ b/src/PVE/HA/Manager.pm
@@ -393,6 +393,7 @@ sub execute_migration {
my $resource_affinity = $self->{compiled_rules}->{'resource-affinity'};
my ($together, $separate) = get_affinitive_resources($resource_affinity, $sid);
+ my $blocked_from_migration;
for my $csid (sort keys %$separate) {
next if !defined($ss->{$csid});
next if $ss->{$csid}->{state} eq 'ignored';
@@ -405,9 +406,11 @@ sub execute_migration {
. " negative affinity with service '$sid'",
);
- return; # one negative resource affinity is enough to not execute migration
+ $blocked_from_migration = 1;
}
+ return if $blocked_from_migration;
+
$haenv->log('info', "got crm command: $cmd");
$ss->{$sid}->{cmd} = [$task, $target];
--
2.47.3
* [pve-devel] [PATCH ha-manager 6/9] config, manager: factor out resource motion info logic
2025-12-15 15:52 [pve-devel] [PATCH-SERIES container/ha-manager/manager/qemu-server 00/12] HA node affinity blockers (#1497) Daniel Kral
` (4 preceding siblings ...)
2025-12-15 15:52 ` [pve-devel] [PATCH ha-manager 5/9] manager: report all reasons when resources are blocked from migration Daniel Kral
@ 2025-12-15 15:52 ` Daniel Kral
2025-12-15 15:52 ` [pve-devel] [PATCH ha-manager 7/9] tests: add test cases for migrating resources with node affinity rules Daniel Kral
` (5 subsequent siblings)
11 siblings, 0 replies; 13+ messages in thread
From: Daniel Kral @ 2025-12-15 15:52 UTC (permalink / raw)
To: pve-devel
The logic in execute_migration(...) and get_resource_motion_info(...) to
gather dependent and blocking HA resources is equivalent and should stay
the same for consistency, so factor it out into a separate helper.
The PVE::HA::Helpers package is introduced since there is no package yet
for logic shared between packages that cannot depend on each other
(e.g. Manager and Config, LRM and CRM, etc.) and PVE::HA::Tools is not
the right place for it.
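A minimal usage sketch of the new helper, mirroring how Config.pm calls
it after this change (variable names as in the diff below):

  # hashset of all nodes with status 'online', as expected by the helper
  my $online_nodes = { map { $ns->{$_} eq 'online' ? ($_ => 1) : () } keys %$ns };

  my ($dependent_resources, $blocking_resources_by_node) =
      PVE::HA::Helpers::get_resource_motion_info($ss, $sid, $online_nodes, $compiled_rules);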
Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
debian/pve-ha-manager.install | 1 +
src/PVE/HA/Config.pm | 31 ++++-----------------
src/PVE/HA/Helpers.pm | 52 +++++++++++++++++++++++++++++++++++
src/PVE/HA/Makefile | 1 +
src/PVE/HA/Manager.pm | 45 +++++++++++++-----------------
5 files changed, 78 insertions(+), 52 deletions(-)
create mode 100644 src/PVE/HA/Helpers.pm
diff --git a/debian/pve-ha-manager.install b/debian/pve-ha-manager.install
index bdb1feeb..6ee0ee5d 100644
--- a/debian/pve-ha-manager.install
+++ b/debian/pve-ha-manager.install
@@ -27,6 +27,7 @@
/usr/share/perl5/PVE/HA/FenceConfig.pm
/usr/share/perl5/PVE/HA/Groups.pm
/usr/share/perl5/PVE/HA/HashTools.pm
+/usr/share/perl5/PVE/HA/Helpers.pm
/usr/share/perl5/PVE/HA/LRM.pm
/usr/share/perl5/PVE/HA/Manager.pm
/usr/share/perl5/PVE/HA/NodeStatus.pm
diff --git a/src/PVE/HA/Config.pm b/src/PVE/HA/Config.pm
index 1199b0d4..f8c5965e 100644
--- a/src/PVE/HA/Config.pm
+++ b/src/PVE/HA/Config.pm
@@ -8,9 +8,9 @@ use JSON;
use PVE::Cluster qw(cfs_register_file cfs_read_file cfs_write_file cfs_lock_file);
use PVE::HA::Groups;
+use PVE::HA::Helpers;
use PVE::HA::Resources;
use PVE::HA::Rules;
-use PVE::HA::Rules::ResourceAffinity qw(get_affinitive_resources);
use PVE::HA::Tools;
my $manager_status_filename = "ha/manager_status";
@@ -391,34 +391,13 @@ sub get_resource_motion_info {
my $manager_status = read_manager_status();
my $ss = $manager_status->{service_status};
my $ns = $manager_status->{node_status};
+ # get_resource_motion_info expects a hashset of all nodes with status 'online'
+ my $online_nodes = { map { $ns->{$_} eq 'online' ? ($_ => 1) : () } keys %$ns };
my $compiled_rules = read_and_compile_rules_config();
- my $resource_affinity = $compiled_rules->{'resource-affinity'};
- my ($together, $separate) = get_affinitive_resources($resource_affinity, $sid);
- for my $csid (sort keys %$together) {
- next if !defined($ss->{$csid});
- next if $ss->{$csid}->{state} eq 'ignored';
-
- push @$dependent_resources, $csid;
- }
-
- for my $node (keys %$ns) {
- next if $ns->{$node} ne 'online';
-
- for my $csid (sort keys %$separate) {
- next if !defined($ss->{$csid});
- next if $ss->{$csid}->{state} eq 'ignored';
- next if $ss->{$csid}->{node} && $ss->{$csid}->{node} ne $node;
- next if $ss->{$csid}->{target} && $ss->{$csid}->{target} ne $node;
-
- push $blocking_resources_by_node->{$node}->@*,
- {
- sid => $csid,
- cause => 'resource-affinity',
- };
- }
- }
+ ($dependent_resources, $blocking_resources_by_node) =
+ PVE::HA::Helpers::get_resource_motion_info($ss, $sid, $online_nodes, $compiled_rules);
}
return ($dependent_resources, $blocking_resources_by_node);
diff --git a/src/PVE/HA/Helpers.pm b/src/PVE/HA/Helpers.pm
new file mode 100644
index 00000000..09300cd4
--- /dev/null
+++ b/src/PVE/HA/Helpers.pm
@@ -0,0 +1,52 @@
+package PVE::HA::Helpers;
+
+use v5.36;
+
+use PVE::HA::Rules::ResourceAffinity qw(get_affinitive_resources);
+
+=head3 get_resource_motion_info
+
+Gathers which other HA resources in C<$ss> put a node placement dependency or
+node placement restriction on C<$sid> according to the compiled rules in
+C<$compiled_rules> and the online nodes in C<$online_nodes>.
+
+Returns a list of two elements, where the first element is a list of HA resource
+ids which are dependent on the node placement of C<$sid>, and the second element
+is a hash of nodes blocked for C<$sid>, where each entry value is a list of the
+causes that make the node unavailable to C<$sid>.
+
+=cut
+
+sub get_resource_motion_info($ss, $sid, $online_nodes, $compiled_rules) {
+ my $dependent_resources = [];
+ my $blocking_resources_by_node = {};
+
+ my $resource_affinity = $compiled_rules->{'resource-affinity'};
+ my ($together, $separate) = get_affinitive_resources($resource_affinity, $sid);
+
+ for my $csid (sort keys %$together) {
+ next if !defined($ss->{$csid});
+ next if $ss->{$csid}->{state} eq 'ignored';
+
+ push @$dependent_resources, $csid;
+ }
+
+ for my $node (keys %$online_nodes) {
+ for my $csid (sort keys %$separate) {
+ next if !defined($ss->{$csid});
+ next if $ss->{$csid}->{state} eq 'ignored';
+ next if $ss->{$csid}->{node} && $ss->{$csid}->{node} ne $node;
+ next if $ss->{$csid}->{target} && $ss->{$csid}->{target} ne $node;
+
+ push $blocking_resources_by_node->{$node}->@*,
+ {
+ sid => $csid,
+ cause => 'resource-affinity',
+ };
+ }
+ }
+
+ return ($dependent_resources, $blocking_resources_by_node);
+}
+
+1;
diff --git a/src/PVE/HA/Makefile b/src/PVE/HA/Makefile
index 1aeb976b..57871b29 100644
--- a/src/PVE/HA/Makefile
+++ b/src/PVE/HA/Makefile
@@ -2,6 +2,7 @@ SIM_SOURCES=CRM.pm \
Env.pm \
Groups.pm \
HashTools.pm \
+ Helpers.pm \
Rules.pm \
Resources.pm \
LRM.pm \
diff --git a/src/PVE/HA/Manager.pm b/src/PVE/HA/Manager.pm
index 74e898f9..470df92c 100644
--- a/src/PVE/HA/Manager.pm
+++ b/src/PVE/HA/Manager.pm
@@ -8,11 +8,12 @@ use Digest::MD5 qw(md5_base64);
use PVE::Tools;
use PVE::HA::Groups;
+use PVE::HA::Helpers;
use PVE::HA::NodeStatus;
use PVE::HA::Rules;
use PVE::HA::Rules::NodeAffinity qw(get_node_affinity);
use PVE::HA::Rules::ResourceAffinity
- qw(get_affinitive_resources get_resource_affinity apply_positive_resource_affinity apply_negative_resource_affinity);
+ qw(get_resource_affinity apply_positive_resource_affinity apply_negative_resource_affinity);
use PVE::HA::Tools ':exit_codes';
use PVE::HA::Usage::Basic;
@@ -388,43 +389,35 @@ sub read_lrm_status {
sub execute_migration {
my ($self, $cmd, $task, $sid, $target) = @_;
- my ($haenv, $ss) = $self->@{qw(haenv ss)};
+ my ($haenv, $ss, $ns, $compiled_rules) = $self->@{qw(haenv ss ns compiled_rules)};
+ my $online_nodes = { map { $_ => 1 } $self->{ns}->list_online_nodes()->@* };
- my $resource_affinity = $self->{compiled_rules}->{'resource-affinity'};
- my ($together, $separate) = get_affinitive_resources($resource_affinity, $sid);
+ my ($dependent_resources, $blocking_resources_by_node) =
+ PVE::HA::Helpers::get_resource_motion_info($ss, $sid, $online_nodes, $compiled_rules);
- my $blocked_from_migration;
- for my $csid (sort keys %$separate) {
- next if !defined($ss->{$csid});
- next if $ss->{$csid}->{state} eq 'ignored';
- next if $ss->{$csid}->{node} && $ss->{$csid}->{node} ne $target;
- next if $ss->{$csid}->{target} && $ss->{$csid}->{target} ne $target;
+ if (my $blocking_resources = $blocking_resources_by_node->{$target}) {
+ for my $blocking_resource (@$blocking_resources) {
+ my $err_msg = "unknown migration blocker reason";
+ my ($csid, $cause) = $blocking_resource->@{qw(sid cause)};
- $haenv->log(
- 'err',
- "crm command '$cmd' error - service '$csid' on node '$target' in"
- . " negative affinity with service '$sid'",
- );
+ if ($cause eq 'resource-affinity') {
+ $err_msg = "service '$csid' on node '$target' in negative"
+ . " affinity with service '$sid'";
+ }
- $blocked_from_migration = 1;
+ $haenv->log('err', "crm command '$cmd' error - $err_msg");
+ }
+
+ return; # do not queue migration if there are blockers
}
- return if $blocked_from_migration;
-
$haenv->log('info', "got crm command: $cmd");
$ss->{$sid}->{cmd} = [$task, $target];
- my $resources_to_migrate = [];
- for my $csid (sort keys %$together) {
- next if !defined($ss->{$csid});
- next if $ss->{$csid}->{state} eq 'ignored';
+ for my $csid (@$dependent_resources) {
next if $ss->{$csid}->{node} && $ss->{$csid}->{node} eq $target;
next if $ss->{$csid}->{target} && $ss->{$csid}->{target} eq $target;
- push @$resources_to_migrate, $csid;
- }
-
- for my $csid (@$resources_to_migrate) {
$haenv->log(
'info',
"crm command '$cmd' - $task service '$csid' to node '$target'"
--
2.47.3
* [pve-devel] [PATCH ha-manager 7/9] tests: add test cases for migrating resources with node affinity rules
2025-12-15 15:52 [pve-devel] [PATCH-SERIES container/ha-manager/manager/qemu-server 00/12] HA node affinity blockers (#1497) Daniel Kral
` (5 preceding siblings ...)
2025-12-15 15:52 ` [pve-devel] [PATCH ha-manager 6/9] config, manager: factor out resource motion info logic Daniel Kral
@ 2025-12-15 15:52 ` Daniel Kral
2025-12-15 15:52 ` [pve-devel] [PATCH ha-manager 8/9] handle strict node affinity rules in manual migrations Daniel Kral
` (4 subsequent siblings)
11 siblings, 0 replies; 13+ messages in thread
From: Daniel Kral @ 2025-12-15 15:52 UTC (permalink / raw)
To: pve-devel
These test cases show the current behavior of manual HA resource
migrations where the HA resource is part of a strict or non-strict node
affinity rule.
They are added in preparation for preventing manual HA resource
migrations/relocations to nodes that are not in the allowed set
according to the HA resource's node affinity rules.
Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
src/test/test-node-affinity-nonstrict7/README | 9 ++
.../test-node-affinity-nonstrict7/cmdlist | 9 ++
.../hardware_status | 5 ++
.../test-node-affinity-nonstrict7/log.expect | 89 +++++++++++++++++++
.../manager_status | 1 +
.../rules_config | 7 ++
.../service_config | 4 +
src/test/test-node-affinity-strict7/README | 9 ++
src/test/test-node-affinity-strict7/cmdlist | 9 ++
.../hardware_status | 5 ++
.../test-node-affinity-strict7/log.expect | 87 ++++++++++++++++++
.../test-node-affinity-strict7/manager_status | 1 +
.../test-node-affinity-strict7/rules_config | 9 ++
.../test-node-affinity-strict7/service_config | 4 +
14 files changed, 248 insertions(+)
create mode 100644 src/test/test-node-affinity-nonstrict7/README
create mode 100644 src/test/test-node-affinity-nonstrict7/cmdlist
create mode 100644 src/test/test-node-affinity-nonstrict7/hardware_status
create mode 100644 src/test/test-node-affinity-nonstrict7/log.expect
create mode 100644 src/test/test-node-affinity-nonstrict7/manager_status
create mode 100644 src/test/test-node-affinity-nonstrict7/rules_config
create mode 100644 src/test/test-node-affinity-nonstrict7/service_config
create mode 100644 src/test/test-node-affinity-strict7/README
create mode 100644 src/test/test-node-affinity-strict7/cmdlist
create mode 100644 src/test/test-node-affinity-strict7/hardware_status
create mode 100644 src/test/test-node-affinity-strict7/log.expect
create mode 100644 src/test/test-node-affinity-strict7/manager_status
create mode 100644 src/test/test-node-affinity-strict7/rules_config
create mode 100644 src/test/test-node-affinity-strict7/service_config
diff --git a/src/test/test-node-affinity-nonstrict7/README b/src/test/test-node-affinity-nonstrict7/README
new file mode 100644
index 00000000..35e532cc
--- /dev/null
+++ b/src/test/test-node-affinity-nonstrict7/README
@@ -0,0 +1,9 @@
+Test whether services in a non-strict node affinity rule handle manual
+migrations to nodes as expected with respect to whether these are part of the
+node affinity rule or not.
+
+The test scenario is:
+- vm:101 should be kept on node1 or node3 (preferred)
+- vm:102 should be kept on node1 or node2 (preferred)
+- vm:101 is running on node3 with failback enabled
+- vm:102 is running on node2 with failback disabled
diff --git a/src/test/test-node-affinity-nonstrict7/cmdlist b/src/test/test-node-affinity-nonstrict7/cmdlist
new file mode 100644
index 00000000..d992c805
--- /dev/null
+++ b/src/test/test-node-affinity-nonstrict7/cmdlist
@@ -0,0 +1,9 @@
+[
+ [ "power node1 on", "power node2 on", "power node3 on"],
+ [ "service vm:101 migrate node1" ],
+ [ "service vm:101 migrate node2" ],
+ [ "service vm:101 migrate node3" ],
+ [ "service vm:102 migrate node3" ],
+ [ "service vm:102 migrate node2" ],
+ [ "service vm:102 migrate node1" ]
+]
diff --git a/src/test/test-node-affinity-nonstrict7/hardware_status b/src/test/test-node-affinity-nonstrict7/hardware_status
new file mode 100644
index 00000000..451beb13
--- /dev/null
+++ b/src/test/test-node-affinity-nonstrict7/hardware_status
@@ -0,0 +1,5 @@
+{
+ "node1": { "power": "off", "network": "off" },
+ "node2": { "power": "off", "network": "off" },
+ "node3": { "power": "off", "network": "off" }
+}
diff --git a/src/test/test-node-affinity-nonstrict7/log.expect b/src/test/test-node-affinity-nonstrict7/log.expect
new file mode 100644
index 00000000..31daa618
--- /dev/null
+++ b/src/test/test-node-affinity-nonstrict7/log.expect
@@ -0,0 +1,89 @@
+info 0 hardware: starting simulation
+info 20 cmdlist: execute power node1 on
+info 20 node1/crm: status change startup => wait_for_quorum
+info 20 node1/lrm: status change startup => wait_for_agent_lock
+info 20 cmdlist: execute power node2 on
+info 20 node2/crm: status change startup => wait_for_quorum
+info 20 node2/lrm: status change startup => wait_for_agent_lock
+info 20 cmdlist: execute power node3 on
+info 20 node3/crm: status change startup => wait_for_quorum
+info 20 node3/lrm: status change startup => wait_for_agent_lock
+info 20 node1/crm: got lock 'ha_manager_lock'
+info 20 node1/crm: status change wait_for_quorum => master
+info 20 node1/crm: node 'node1': state changed from 'unknown' => 'online'
+info 20 node1/crm: node 'node2': state changed from 'unknown' => 'online'
+info 20 node1/crm: node 'node3': state changed from 'unknown' => 'online'
+info 20 node1/crm: adding new service 'vm:101' on node 'node3'
+info 20 node1/crm: adding new service 'vm:102' on node 'node2'
+info 20 node1/crm: service 'vm:101': state changed from 'request_start' to 'started' (node = node3)
+info 20 node1/crm: service 'vm:102': state changed from 'request_start' to 'started' (node = node2)
+info 22 node2/crm: status change wait_for_quorum => slave
+info 23 node2/lrm: got lock 'ha_agent_node2_lock'
+info 23 node2/lrm: status change wait_for_agent_lock => active
+info 23 node2/lrm: starting service vm:102
+info 23 node2/lrm: service status vm:102 started
+info 24 node3/crm: status change wait_for_quorum => slave
+info 25 node3/lrm: got lock 'ha_agent_node3_lock'
+info 25 node3/lrm: status change wait_for_agent_lock => active
+info 25 node3/lrm: starting service vm:101
+info 25 node3/lrm: service status vm:101 started
+info 120 cmdlist: execute service vm:101 migrate node1
+info 120 node1/crm: got crm command: migrate vm:101 node1
+info 120 node1/crm: migrate service 'vm:101' to node 'node1'
+info 120 node1/crm: service 'vm:101': state changed from 'started' to 'migrate' (node = node3, target = node1)
+info 121 node1/lrm: got lock 'ha_agent_node1_lock'
+info 121 node1/lrm: status change wait_for_agent_lock => active
+info 125 node3/lrm: service vm:101 - start migrate to node 'node1'
+info 125 node3/lrm: service vm:101 - end migrate to node 'node1'
+info 140 node1/crm: service 'vm:101': state changed from 'migrate' to 'started' (node = node1)
+info 140 node1/crm: migrate service 'vm:101' to node 'node3' (running)
+info 140 node1/crm: service 'vm:101': state changed from 'started' to 'migrate' (node = node1, target = node3)
+info 141 node1/lrm: service vm:101 - start migrate to node 'node3'
+info 141 node1/lrm: service vm:101 - end migrate to node 'node3'
+info 160 node1/crm: service 'vm:101': state changed from 'migrate' to 'started' (node = node3)
+info 165 node3/lrm: starting service vm:101
+info 165 node3/lrm: service status vm:101 started
+info 220 cmdlist: execute service vm:101 migrate node2
+info 220 node1/crm: got crm command: migrate vm:101 node2
+info 220 node1/crm: migrate service 'vm:101' to node 'node2'
+info 220 node1/crm: service 'vm:101': state changed from 'started' to 'migrate' (node = node3, target = node2)
+info 225 node3/lrm: service vm:101 - start migrate to node 'node2'
+info 225 node3/lrm: service vm:101 - end migrate to node 'node2'
+info 240 node1/crm: service 'vm:101': state changed from 'migrate' to 'started' (node = node2)
+info 240 node1/crm: migrate service 'vm:101' to node 'node3' (running)
+info 240 node1/crm: service 'vm:101': state changed from 'started' to 'migrate' (node = node2, target = node3)
+info 243 node2/lrm: service vm:101 - start migrate to node 'node3'
+info 243 node2/lrm: service vm:101 - end migrate to node 'node3'
+info 260 node1/crm: service 'vm:101': state changed from 'migrate' to 'started' (node = node3)
+info 265 node3/lrm: starting service vm:101
+info 265 node3/lrm: service status vm:101 started
+info 320 cmdlist: execute service vm:101 migrate node3
+info 320 node1/crm: ignore crm command - service already on target node: migrate vm:101 node3
+info 420 cmdlist: execute service vm:102 migrate node3
+info 420 node1/crm: got crm command: migrate vm:102 node3
+info 420 node1/crm: migrate service 'vm:102' to node 'node3'
+info 420 node1/crm: service 'vm:102': state changed from 'started' to 'migrate' (node = node2, target = node3)
+info 423 node2/lrm: service vm:102 - start migrate to node 'node3'
+info 423 node2/lrm: service vm:102 - end migrate to node 'node3'
+info 440 node1/crm: service 'vm:102': state changed from 'migrate' to 'started' (node = node3)
+info 445 node3/lrm: starting service vm:102
+info 445 node3/lrm: service status vm:102 started
+info 520 cmdlist: execute service vm:102 migrate node2
+info 520 node1/crm: got crm command: migrate vm:102 node2
+info 520 node1/crm: migrate service 'vm:102' to node 'node2'
+info 520 node1/crm: service 'vm:102': state changed from 'started' to 'migrate' (node = node3, target = node2)
+info 525 node3/lrm: service vm:102 - start migrate to node 'node2'
+info 525 node3/lrm: service vm:102 - end migrate to node 'node2'
+info 540 node1/crm: service 'vm:102': state changed from 'migrate' to 'started' (node = node2)
+info 543 node2/lrm: starting service vm:102
+info 543 node2/lrm: service status vm:102 started
+info 620 cmdlist: execute service vm:102 migrate node1
+info 620 node1/crm: got crm command: migrate vm:102 node1
+info 620 node1/crm: migrate service 'vm:102' to node 'node1'
+info 620 node1/crm: service 'vm:102': state changed from 'started' to 'migrate' (node = node2, target = node1)
+info 623 node2/lrm: service vm:102 - start migrate to node 'node1'
+info 623 node2/lrm: service vm:102 - end migrate to node 'node1'
+info 640 node1/crm: service 'vm:102': state changed from 'migrate' to 'started' (node = node1)
+info 641 node1/lrm: starting service vm:102
+info 641 node1/lrm: service status vm:102 started
+info 1220 hardware: exit simulation - done
diff --git a/src/test/test-node-affinity-nonstrict7/manager_status b/src/test/test-node-affinity-nonstrict7/manager_status
new file mode 100644
index 00000000..9e26dfee
--- /dev/null
+++ b/src/test/test-node-affinity-nonstrict7/manager_status
@@ -0,0 +1 @@
+{}
\ No newline at end of file
diff --git a/src/test/test-node-affinity-nonstrict7/rules_config b/src/test/test-node-affinity-nonstrict7/rules_config
new file mode 100644
index 00000000..8aa2c589
--- /dev/null
+++ b/src/test/test-node-affinity-nonstrict7/rules_config
@@ -0,0 +1,7 @@
+node-affinity: vm101-should-be-on-node1-node3
+ nodes node1:1,node3:2
+ resources vm:101
+
+node-affinity: vm102-should-be-on-node1-node2
+ nodes node1:1,node2:2
+ resources vm:102
diff --git a/src/test/test-node-affinity-nonstrict7/service_config b/src/test/test-node-affinity-nonstrict7/service_config
new file mode 100644
index 00000000..3a916390
--- /dev/null
+++ b/src/test/test-node-affinity-nonstrict7/service_config
@@ -0,0 +1,4 @@
+{
+ "vm:101": { "node": "node3", "state": "started", "failback": 1 },
+ "vm:102": { "node": "node2", "state": "started", "failback": 0 }
+}
diff --git a/src/test/test-node-affinity-strict7/README b/src/test/test-node-affinity-strict7/README
new file mode 100644
index 00000000..bc0096f5
--- /dev/null
+++ b/src/test/test-node-affinity-strict7/README
@@ -0,0 +1,9 @@
+Test whether services in a strict node affinity rule handle manual migrations
+to nodes as expected with respect to whether these are part of the node
+affinity rule or not.
+
+The test scenario is:
+- vm:101 must be kept on node1 or node3 (preferred)
+- vm:102 must be kept on node1 or node2 (preferred)
+- vm:101 is running on node3 with failback enabled
+- vm:102 is running on node2 with failback disabled
diff --git a/src/test/test-node-affinity-strict7/cmdlist b/src/test/test-node-affinity-strict7/cmdlist
new file mode 100644
index 00000000..d992c805
--- /dev/null
+++ b/src/test/test-node-affinity-strict7/cmdlist
@@ -0,0 +1,9 @@
+[
+ [ "power node1 on", "power node2 on", "power node3 on"],
+ [ "service vm:101 migrate node1" ],
+ [ "service vm:101 migrate node2" ],
+ [ "service vm:101 migrate node3" ],
+ [ "service vm:102 migrate node3" ],
+ [ "service vm:102 migrate node2" ],
+ [ "service vm:102 migrate node1" ]
+]
diff --git a/src/test/test-node-affinity-strict7/hardware_status b/src/test/test-node-affinity-strict7/hardware_status
new file mode 100644
index 00000000..451beb13
--- /dev/null
+++ b/src/test/test-node-affinity-strict7/hardware_status
@@ -0,0 +1,5 @@
+{
+ "node1": { "power": "off", "network": "off" },
+ "node2": { "power": "off", "network": "off" },
+ "node3": { "power": "off", "network": "off" }
+}
diff --git a/src/test/test-node-affinity-strict7/log.expect b/src/test/test-node-affinity-strict7/log.expect
new file mode 100644
index 00000000..cbe9f323
--- /dev/null
+++ b/src/test/test-node-affinity-strict7/log.expect
@@ -0,0 +1,87 @@
+info 0 hardware: starting simulation
+info 20 cmdlist: execute power node1 on
+info 20 node1/crm: status change startup => wait_for_quorum
+info 20 node1/lrm: status change startup => wait_for_agent_lock
+info 20 cmdlist: execute power node2 on
+info 20 node2/crm: status change startup => wait_for_quorum
+info 20 node2/lrm: status change startup => wait_for_agent_lock
+info 20 cmdlist: execute power node3 on
+info 20 node3/crm: status change startup => wait_for_quorum
+info 20 node3/lrm: status change startup => wait_for_agent_lock
+info 20 node1/crm: got lock 'ha_manager_lock'
+info 20 node1/crm: status change wait_for_quorum => master
+info 20 node1/crm: node 'node1': state changed from 'unknown' => 'online'
+info 20 node1/crm: node 'node2': state changed from 'unknown' => 'online'
+info 20 node1/crm: node 'node3': state changed from 'unknown' => 'online'
+info 20 node1/crm: adding new service 'vm:101' on node 'node3'
+info 20 node1/crm: adding new service 'vm:102' on node 'node2'
+info 20 node1/crm: service 'vm:101': state changed from 'request_start' to 'started' (node = node3)
+info 20 node1/crm: service 'vm:102': state changed from 'request_start' to 'started' (node = node2)
+info 22 node2/crm: status change wait_for_quorum => slave
+info 23 node2/lrm: got lock 'ha_agent_node2_lock'
+info 23 node2/lrm: status change wait_for_agent_lock => active
+info 23 node2/lrm: starting service vm:102
+info 23 node2/lrm: service status vm:102 started
+info 24 node3/crm: status change wait_for_quorum => slave
+info 25 node3/lrm: got lock 'ha_agent_node3_lock'
+info 25 node3/lrm: status change wait_for_agent_lock => active
+info 25 node3/lrm: starting service vm:101
+info 25 node3/lrm: service status vm:101 started
+info 120 cmdlist: execute service vm:101 migrate node1
+info 120 node1/crm: got crm command: migrate vm:101 node1
+info 120 node1/crm: migrate service 'vm:101' to node 'node1'
+info 120 node1/crm: service 'vm:101': state changed from 'started' to 'migrate' (node = node3, target = node1)
+info 121 node1/lrm: got lock 'ha_agent_node1_lock'
+info 121 node1/lrm: status change wait_for_agent_lock => active
+info 125 node3/lrm: service vm:101 - start migrate to node 'node1'
+info 125 node3/lrm: service vm:101 - end migrate to node 'node1'
+info 140 node1/crm: service 'vm:101': state changed from 'migrate' to 'started' (node = node1)
+info 140 node1/crm: migrate service 'vm:101' to node 'node3' (running)
+info 140 node1/crm: service 'vm:101': state changed from 'started' to 'migrate' (node = node1, target = node3)
+info 141 node1/lrm: service vm:101 - start migrate to node 'node3'
+info 141 node1/lrm: service vm:101 - end migrate to node 'node3'
+info 160 node1/crm: service 'vm:101': state changed from 'migrate' to 'started' (node = node3)
+info 165 node3/lrm: starting service vm:101
+info 165 node3/lrm: service status vm:101 started
+info 220 cmdlist: execute service vm:101 migrate node2
+info 220 node1/crm: got crm command: migrate vm:101 node2
+info 220 node1/crm: migrate service 'vm:101' to node 'node2'
+info 220 node1/crm: service 'vm:101': state changed from 'started' to 'migrate' (node = node3, target = node2)
+info 225 node3/lrm: service vm:101 - start migrate to node 'node2'
+info 225 node3/lrm: service vm:101 - end migrate to node 'node2'
+info 240 node1/crm: service 'vm:101': state changed from 'migrate' to 'started' (node = node2)
+info 240 node1/crm: migrate service 'vm:101' to node 'node3' (running)
+info 240 node1/crm: service 'vm:101': state changed from 'started' to 'migrate' (node = node2, target = node3)
+info 243 node2/lrm: service vm:101 - start migrate to node 'node3'
+info 243 node2/lrm: service vm:101 - end migrate to node 'node3'
+info 260 node1/crm: service 'vm:101': state changed from 'migrate' to 'started' (node = node3)
+info 265 node3/lrm: starting service vm:101
+info 265 node3/lrm: service status vm:101 started
+info 320 cmdlist: execute service vm:101 migrate node3
+info 320 node1/crm: ignore crm command - service already on target node: migrate vm:101 node3
+info 420 cmdlist: execute service vm:102 migrate node3
+info 420 node1/crm: got crm command: migrate vm:102 node3
+info 420 node1/crm: migrate service 'vm:102' to node 'node3'
+info 420 node1/crm: service 'vm:102': state changed from 'started' to 'migrate' (node = node2, target = node3)
+info 423 node2/lrm: service vm:102 - start migrate to node 'node3'
+info 423 node2/lrm: service vm:102 - end migrate to node 'node3'
+info 440 node1/crm: service 'vm:102': state changed from 'migrate' to 'started' (node = node3)
+info 440 node1/crm: migrate service 'vm:102' to node 'node2' (running)
+info 440 node1/crm: service 'vm:102': state changed from 'started' to 'migrate' (node = node3, target = node2)
+info 445 node3/lrm: service vm:102 - start migrate to node 'node2'
+info 445 node3/lrm: service vm:102 - end migrate to node 'node2'
+info 460 node1/crm: service 'vm:102': state changed from 'migrate' to 'started' (node = node2)
+info 463 node2/lrm: starting service vm:102
+info 463 node2/lrm: service status vm:102 started
+info 520 cmdlist: execute service vm:102 migrate node2
+info 520 node1/crm: ignore crm command - service already on target node: migrate vm:102 node2
+info 620 cmdlist: execute service vm:102 migrate node1
+info 620 node1/crm: got crm command: migrate vm:102 node1
+info 620 node1/crm: migrate service 'vm:102' to node 'node1'
+info 620 node1/crm: service 'vm:102': state changed from 'started' to 'migrate' (node = node2, target = node1)
+info 623 node2/lrm: service vm:102 - start migrate to node 'node1'
+info 623 node2/lrm: service vm:102 - end migrate to node 'node1'
+info 640 node1/crm: service 'vm:102': state changed from 'migrate' to 'started' (node = node1)
+info 641 node1/lrm: starting service vm:102
+info 641 node1/lrm: service status vm:102 started
+info 1220 hardware: exit simulation - done
diff --git a/src/test/test-node-affinity-strict7/manager_status b/src/test/test-node-affinity-strict7/manager_status
new file mode 100644
index 00000000..9e26dfee
--- /dev/null
+++ b/src/test/test-node-affinity-strict7/manager_status
@@ -0,0 +1 @@
+{}
\ No newline at end of file
diff --git a/src/test/test-node-affinity-strict7/rules_config b/src/test/test-node-affinity-strict7/rules_config
new file mode 100644
index 00000000..622ba80b
--- /dev/null
+++ b/src/test/test-node-affinity-strict7/rules_config
@@ -0,0 +1,9 @@
+node-affinity: vm101-must-be-on-node1-node3
+ nodes node1:1,node3:2
+ resources vm:101
+ strict 1
+
+node-affinity: vm102-must-be-on-node1-node2
+ nodes node1:1,node2:2
+ resources vm:102
+ strict 1
diff --git a/src/test/test-node-affinity-strict7/service_config b/src/test/test-node-affinity-strict7/service_config
new file mode 100644
index 00000000..3a916390
--- /dev/null
+++ b/src/test/test-node-affinity-strict7/service_config
@@ -0,0 +1,4 @@
+{
+ "vm:101": { "node": "node3", "state": "started", "failback": 1 },
+ "vm:102": { "node": "node2", "state": "started", "failback": 0 }
+}
--
2.47.3
* [pve-devel] [PATCH ha-manager 8/9] handle strict node affinity rules in manual migrations
2025-12-15 15:52 [pve-devel] [PATCH-SERIES container/ha-manager/manager/qemu-server 00/12] HA node affinity blockers (#1497) Daniel Kral
` (6 preceding siblings ...)
2025-12-15 15:52 ` [pve-devel] [PATCH ha-manager 7/9] tests: add test cases for migrating resources with node affinity rules Daniel Kral
@ 2025-12-15 15:52 ` Daniel Kral
2025-12-15 15:52 ` [pve-devel] [PATCH ha-manager 9/9] handle node affinity rules with failback " Daniel Kral
` (3 subsequent siblings)
11 siblings, 0 replies; 13+ messages in thread
From: Daniel Kral @ 2025-12-15 15:52 UTC (permalink / raw)
To: pve-devel
Do not execute a manual user migration of an HA resource to a target
node on which it is not allowed to run according to the strict node
affinity rule it is part of.
This prevents users from moving an HA resource that would be migrated
back to an allowed member node of the strict node affinity rule
immediately afterwards, which just wastes time and resources.
This new information only reaches the ha_manager CLI's stdout/stderr
and the HA Manager node's syslog respectively, so other user-facing
endpoints need to implement this logic as well to give users adequate
feedback on why migrations are not executed.
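For example, with the strict rule from the new test cases, blocking
vm:101 from migrating to node2 is logged by the CRM as shown in the
updated log.expect below, and the CLI error built from the format
strings in ha_manager.pm would look roughly like:

  cannot migrate resource 'vm:101' to node 'node2':

  - resource 'vm:101' not allowed on target node 'node2'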
Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
src/PVE/API2/HA/Resources.pm | 4 +--
src/PVE/CLI/ha_manager.pm | 14 +++++-----
src/PVE/HA/Helpers.pm | 13 ++++++++-
src/PVE/HA/Manager.pm | 7 +++--
.../test-node-affinity-strict1/log.expect | 16 +----------
.../test-node-affinity-strict2/log.expect | 16 +----------
.../test-node-affinity-strict7/log.expect | 28 ++-----------------
src/test/test-recovery4/log.expect | 2 +-
8 files changed, 31 insertions(+), 69 deletions(-)
diff --git a/src/PVE/API2/HA/Resources.pm b/src/PVE/API2/HA/Resources.pm
index b95c0e1f..51784935 100644
--- a/src/PVE/API2/HA/Resources.pm
+++ b/src/PVE/API2/HA/Resources.pm
@@ -377,7 +377,7 @@ __PACKAGE__->register_method({
type => 'string',
description => "The reason why the HA resource is"
. " blocking the migration.",
- enum => ['resource-affinity'],
+ enum => ['node-affinity', 'resource-affinity'],
},
},
},
@@ -479,7 +479,7 @@ __PACKAGE__->register_method({
type => 'string',
description => "The reason why the HA resource is"
. " blocking the relocation.",
- enum => ['resource-affinity'],
+ enum => ['node-affinity', 'resource-affinity'],
},
},
},
diff --git a/src/PVE/CLI/ha_manager.pm b/src/PVE/CLI/ha_manager.pm
index bccb4438..5c6cee02 100644
--- a/src/PVE/CLI/ha_manager.pm
+++ b/src/PVE/CLI/ha_manager.pm
@@ -160,15 +160,15 @@ my $print_resource_motion_output = sub {
my $err_msg = "cannot $cmd resource '$sid' to node '$req_node':\n\n";
for my $blocking_resource (@$blocking_resources) {
- my ($csid, $cause) = $blocking_resource->@{qw(sid cause)};
+ my $cause = $blocking_resource->{cause};
- $err_msg .= "- resource '$csid' on target node '$req_node'";
-
- if ($cause eq 'resource-affinity') {
- $err_msg .= " in negative affinity with resource '$sid'";
+ if ($cause eq 'node-affinity') {
+ $err_msg .= "- resource '$sid' not allowed on target node '$req_node'\n";
+ } elsif ($cause eq 'resource-affinity') {
+ my $csid = $blocking_resource->{sid};
+ $err_msg .= "- resource '$csid' on target node '$req_node'"
+ . " in negative affinity with resource '$sid'\n";
}
-
- $err_msg .= "\n";
}
die $err_msg;
diff --git a/src/PVE/HA/Helpers.pm b/src/PVE/HA/Helpers.pm
index 09300cd4..b160c541 100644
--- a/src/PVE/HA/Helpers.pm
+++ b/src/PVE/HA/Helpers.pm
@@ -2,6 +2,7 @@ package PVE::HA::Helpers;
use v5.36;
+use PVE::HA::Rules::NodeAffinity qw(get_node_affinity);
use PVE::HA::Rules::ResourceAffinity qw(get_affinitive_resources);
=head3 get_resource_motion_info
@@ -21,7 +22,9 @@ sub get_resource_motion_info($ss, $sid, $online_nodes, $compiled_rules) {
my $dependent_resources = [];
my $blocking_resources_by_node = {};
- my $resource_affinity = $compiled_rules->{'resource-affinity'};
+ my ($node_affinity, $resource_affinity) =
+ $compiled_rules->@{qw(node-affinity resource-affinity)};
+ my ($allowed_nodes) = get_node_affinity($node_affinity, $sid, $online_nodes);
my ($together, $separate) = get_affinitive_resources($resource_affinity, $sid);
for my $csid (sort keys %$together) {
@@ -32,6 +35,14 @@ sub get_resource_motion_info($ss, $sid, $online_nodes, $compiled_rules) {
}
for my $node (keys %$online_nodes) {
+ if (!$allowed_nodes->{$node}) {
+ push $blocking_resources_by_node->{$node}->@*,
+ {
+ sid => $sid,
+ cause => 'node-affinity',
+ };
+ }
+
for my $csid (sort keys %$separate) {
next if !defined($ss->{$csid});
next if $ss->{$csid}->{state} eq 'ignored';
diff --git a/src/PVE/HA/Manager.pm b/src/PVE/HA/Manager.pm
index 470df92c..d1ff9615 100644
--- a/src/PVE/HA/Manager.pm
+++ b/src/PVE/HA/Manager.pm
@@ -398,9 +398,12 @@ sub execute_migration {
if (my $blocking_resources = $blocking_resources_by_node->{$target}) {
for my $blocking_resource (@$blocking_resources) {
my $err_msg = "unknown migration blocker reason";
- my ($csid, $cause) = $blocking_resource->@{qw(sid cause)};
+ my $cause = $blocking_resource->{cause};
- if ($cause eq 'resource-affinity') {
+ if ($cause eq 'node-affinity') {
+ $err_msg = "service '$sid' is not allowed on node '$target'";
+ } elsif ($cause eq 'resource-affinity') {
+ my $csid = $blocking_resource->{sid};
$err_msg = "service '$csid' on node '$target' in negative"
. " affinity with service '$sid'";
}
diff --git a/src/test/test-node-affinity-strict1/log.expect b/src/test/test-node-affinity-strict1/log.expect
index d86c69de..ca2c40b3 100644
--- a/src/test/test-node-affinity-strict1/log.expect
+++ b/src/test/test-node-affinity-strict1/log.expect
@@ -22,19 +22,5 @@ info 25 node3/lrm: status change wait_for_agent_lock => active
info 25 node3/lrm: starting service vm:101
info 25 node3/lrm: service status vm:101 started
info 120 cmdlist: execute service vm:101 migrate node2
-info 120 node1/crm: got crm command: migrate vm:101 node2
-info 120 node1/crm: migrate service 'vm:101' to node 'node2'
-info 120 node1/crm: service 'vm:101': state changed from 'started' to 'migrate' (node = node3, target = node2)
-info 123 node2/lrm: got lock 'ha_agent_node2_lock'
-info 123 node2/lrm: status change wait_for_agent_lock => active
-info 125 node3/lrm: service vm:101 - start migrate to node 'node2'
-info 125 node3/lrm: service vm:101 - end migrate to node 'node2'
-info 140 node1/crm: service 'vm:101': state changed from 'migrate' to 'started' (node = node2)
-info 140 node1/crm: migrate service 'vm:101' to node 'node3' (running)
-info 140 node1/crm: service 'vm:101': state changed from 'started' to 'migrate' (node = node2, target = node3)
-info 143 node2/lrm: service vm:101 - start migrate to node 'node3'
-info 143 node2/lrm: service vm:101 - end migrate to node 'node3'
-info 160 node1/crm: service 'vm:101': state changed from 'migrate' to 'started' (node = node3)
-info 165 node3/lrm: starting service vm:101
-info 165 node3/lrm: service status vm:101 started
+err 120 node1/crm: crm command 'migrate vm:101 node2' error - service 'vm:101' is not allowed on node 'node2'
info 720 hardware: exit simulation - done
diff --git a/src/test/test-node-affinity-strict2/log.expect b/src/test/test-node-affinity-strict2/log.expect
index d86c69de..ca2c40b3 100644
--- a/src/test/test-node-affinity-strict2/log.expect
+++ b/src/test/test-node-affinity-strict2/log.expect
@@ -22,19 +22,5 @@ info 25 node3/lrm: status change wait_for_agent_lock => active
info 25 node3/lrm: starting service vm:101
info 25 node3/lrm: service status vm:101 started
info 120 cmdlist: execute service vm:101 migrate node2
-info 120 node1/crm: got crm command: migrate vm:101 node2
-info 120 node1/crm: migrate service 'vm:101' to node 'node2'
-info 120 node1/crm: service 'vm:101': state changed from 'started' to 'migrate' (node = node3, target = node2)
-info 123 node2/lrm: got lock 'ha_agent_node2_lock'
-info 123 node2/lrm: status change wait_for_agent_lock => active
-info 125 node3/lrm: service vm:101 - start migrate to node 'node2'
-info 125 node3/lrm: service vm:101 - end migrate to node 'node2'
-info 140 node1/crm: service 'vm:101': state changed from 'migrate' to 'started' (node = node2)
-info 140 node1/crm: migrate service 'vm:101' to node 'node3' (running)
-info 140 node1/crm: service 'vm:101': state changed from 'started' to 'migrate' (node = node2, target = node3)
-info 143 node2/lrm: service vm:101 - start migrate to node 'node3'
-info 143 node2/lrm: service vm:101 - end migrate to node 'node3'
-info 160 node1/crm: service 'vm:101': state changed from 'migrate' to 'started' (node = node3)
-info 165 node3/lrm: starting service vm:101
-info 165 node3/lrm: service status vm:101 started
+err 120 node1/crm: crm command 'migrate vm:101 node2' error - service 'vm:101' is not allowed on node 'node2'
info 720 hardware: exit simulation - done
diff --git a/src/test/test-node-affinity-strict7/log.expect b/src/test/test-node-affinity-strict7/log.expect
index cbe9f323..9c4e9f0b 100644
--- a/src/test/test-node-affinity-strict7/log.expect
+++ b/src/test/test-node-affinity-strict7/log.expect
@@ -44,35 +44,11 @@ info 160 node1/crm: service 'vm:101': state changed from 'migrate' to 'sta
info 165 node3/lrm: starting service vm:101
info 165 node3/lrm: service status vm:101 started
info 220 cmdlist: execute service vm:101 migrate node2
-info 220 node1/crm: got crm command: migrate vm:101 node2
-info 220 node1/crm: migrate service 'vm:101' to node 'node2'
-info 220 node1/crm: service 'vm:101': state changed from 'started' to 'migrate' (node = node3, target = node2)
-info 225 node3/lrm: service vm:101 - start migrate to node 'node2'
-info 225 node3/lrm: service vm:101 - end migrate to node 'node2'
-info 240 node1/crm: service 'vm:101': state changed from 'migrate' to 'started' (node = node2)
-info 240 node1/crm: migrate service 'vm:101' to node 'node3' (running)
-info 240 node1/crm: service 'vm:101': state changed from 'started' to 'migrate' (node = node2, target = node3)
-info 243 node2/lrm: service vm:101 - start migrate to node 'node3'
-info 243 node2/lrm: service vm:101 - end migrate to node 'node3'
-info 260 node1/crm: service 'vm:101': state changed from 'migrate' to 'started' (node = node3)
-info 265 node3/lrm: starting service vm:101
-info 265 node3/lrm: service status vm:101 started
+err 220 node1/crm: crm command 'migrate vm:101 node2' error - service 'vm:101' is not allowed on node 'node2'
info 320 cmdlist: execute service vm:101 migrate node3
info 320 node1/crm: ignore crm command - service already on target node: migrate vm:101 node3
info 420 cmdlist: execute service vm:102 migrate node3
-info 420 node1/crm: got crm command: migrate vm:102 node3
-info 420 node1/crm: migrate service 'vm:102' to node 'node3'
-info 420 node1/crm: service 'vm:102': state changed from 'started' to 'migrate' (node = node2, target = node3)
-info 423 node2/lrm: service vm:102 - start migrate to node 'node3'
-info 423 node2/lrm: service vm:102 - end migrate to node 'node3'
-info 440 node1/crm: service 'vm:102': state changed from 'migrate' to 'started' (node = node3)
-info 440 node1/crm: migrate service 'vm:102' to node 'node2' (running)
-info 440 node1/crm: service 'vm:102': state changed from 'started' to 'migrate' (node = node3, target = node2)
-info 445 node3/lrm: service vm:102 - start migrate to node 'node2'
-info 445 node3/lrm: service vm:102 - end migrate to node 'node2'
-info 460 node1/crm: service 'vm:102': state changed from 'migrate' to 'started' (node = node2)
-info 463 node2/lrm: starting service vm:102
-info 463 node2/lrm: service status vm:102 started
+err 420 node1/crm: crm command 'migrate vm:102 node3' error - service 'vm:102' is not allowed on node 'node3'
info 520 cmdlist: execute service vm:102 migrate node2
info 520 node1/crm: ignore crm command - service already on target node: migrate vm:102 node2
info 620 cmdlist: execute service vm:102 migrate node1
diff --git a/src/test/test-recovery4/log.expect b/src/test/test-recovery4/log.expect
index 12983b5f..684c796b 100644
--- a/src/test/test-recovery4/log.expect
+++ b/src/test/test-recovery4/log.expect
@@ -43,7 +43,7 @@ err 260 node1/crm: recovering service 'vm:102' from fenced node 'node2' f
err 280 node1/crm: recovering service 'vm:102' from fenced node 'node2' failed, no recovery node found
err 300 node1/crm: recovering service 'vm:102' from fenced node 'node2' failed, no recovery node found
info 320 cmdlist: execute service vm:102 migrate node3
-info 320 node1/crm: got crm command: migrate vm:102 node3
+err 320 node1/crm: crm command 'migrate vm:102 node3' error - service 'vm:102' is not allowed on node 'node3'
err 320 node1/crm: recovering service 'vm:102' from fenced node 'node2' failed, no recovery node found
err 340 node1/crm: recovering service 'vm:102' from fenced node 'node2' failed, no recovery node found
err 360 node1/crm: recovering service 'vm:102' from fenced node 'node2' failed, no recovery node found
--
2.47.3
* [pve-devel] [PATCH ha-manager 9/9] handle node affinity rules with failback in manual migrations
2025-12-15 15:52 [pve-devel] [PATCH-SERIES container/ha-manager/manager/qemu-server 00/12] HA node affinity blockers (#1497) Daniel Kral
` (7 preceding siblings ...)
2025-12-15 15:52 ` [pve-devel] [PATCH ha-manager 8/9] handle strict node affinity rules in manual migrations Daniel Kral
@ 2025-12-15 15:52 ` Daniel Kral
2025-12-15 15:52 ` [pve-devel] [PATCH qemu-server 1/1] api: migration preconditions: add node affinity as blocking cause Daniel Kral
` (2 subsequent siblings)
11 siblings, 0 replies; 13+ messages in thread
From: Daniel Kral @ 2025-12-15 15:52 UTC (permalink / raw)
To: pve-devel
Do not execute a manual user migration of an HA resource to a target
node that is not one of the highest-priority nodes if the HA resource
has failback set.
This prevents users from moving an HA resource that would immediately be
failed back to a higher-priority node of its strict or non-strict node
affinity rule, which only wastes time and resources.
Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
I thought about exposing the service configuration hash ($sc) through
$self->{sc} in the HA Manager instead, as we already do with
$self->{groups} and $self->{rules} / $self->{compiled_rules}, but for
now I kept passing it to the appropriate routines.
A nice future cleanup would be to separate these parts into something
like a Resources module, which controls the config and state of HA
resources, instead of keeping all of this logic in the Manager module,
but there are more important things to do right now.
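Below is a minimal, self-contained sketch of the decision this patch
adds (hypothetical node names and rule results, not calling the real
PVE::HA modules; the real code uses get_node_affinity() to compute the
allowed and highest-priority node sets from the compiled node affinity
rules):

    #!/usr/bin/perl

    use strict;
    use warnings;

    # Hypothetical rule results as get_node_affinity() would return them:
    # $allowed_nodes are all nodes the resource may run on, $pri_nodes is
    # the currently highest-priority subset of them.
    my $allowed_nodes = { node2 => 1, node3 => 1 };
    my $pri_nodes = { node3 => 1 };

    # Minimal stand-in for the resource config ($cd) with defaults applied.
    my $cd = { failback => 1 };

    my $online_nodes = { node1 => 1, node2 => 1, node3 => 1 };

    for my $node (sort keys %$online_nodes) {
        # A manual migration to $node is blocked if the node is not allowed
        # at all, or if failback is set and the node is not a
        # highest-priority node, because the CRM would move the resource
        # away again right after the migration.
        my $blocked = !$allowed_nodes->{$node}
            || ($cd->{failback} && !$pri_nodes->{$node});

        printf "%s: %s\n", $node, $blocked ? 'blocked' : 'allowed';
    }

With failback set, node2 is reported as blocked even though the rule
allows it, since the resource would be failed back to node3 right away;
without failback only node1 would be blocked.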
src/PVE/HA/Config.pm | 11 +++++--
src/PVE/HA/Helpers.pm | 6 ++--
src/PVE/HA/Manager.pm | 13 +++++---
.../test-node-affinity-nonstrict1/log.expect | 16 +---------
.../test-node-affinity-nonstrict7/log.expect | 32 +++----------------
.../test-node-affinity-strict7/log.expect | 18 ++---------
6 files changed, 27 insertions(+), 69 deletions(-)
diff --git a/src/PVE/HA/Config.pm b/src/PVE/HA/Config.pm
index f8c5965e..fa14816c 100644
--- a/src/PVE/HA/Config.pm
+++ b/src/PVE/HA/Config.pm
@@ -382,22 +382,27 @@ sub service_is_configured {
sub get_resource_motion_info {
my ($sid) = @_;
- my $resources = read_resources_config();
+ my $conf = read_resources_config();
my $dependent_resources = [];
my $blocking_resources_by_node = {};
- if (&$service_check_ha_state($resources, $sid)) {
+ if (&$service_check_ha_state($conf, $sid)) {
my $manager_status = read_manager_status();
my $ss = $manager_status->{service_status};
my $ns = $manager_status->{node_status};
# get_resource_motion_info expects a hashset of all nodes with status 'online'
my $online_nodes = { map { $ns->{$_} eq 'online' ? ($_ => 1) : () } keys %$ns };
+ # get_resource_motion_info expects a resource config with defaults set
+ my $resources = read_and_check_resources_config();
my $compiled_rules = read_and_compile_rules_config();
+ my $cd = $resources->{$sid} // {};
($dependent_resources, $blocking_resources_by_node) =
- PVE::HA::Helpers::get_resource_motion_info($ss, $sid, $online_nodes, $compiled_rules);
+ PVE::HA::Helpers::get_resource_motion_info(
+ $ss, $sid, $cd, $online_nodes, $compiled_rules,
+ );
}
return ($dependent_resources, $blocking_resources_by_node);
diff --git a/src/PVE/HA/Helpers.pm b/src/PVE/HA/Helpers.pm
index b160c541..a58b1e12 100644
--- a/src/PVE/HA/Helpers.pm
+++ b/src/PVE/HA/Helpers.pm
@@ -18,13 +18,13 @@ causes that make the node unavailable to C<$sid>.
=cut
-sub get_resource_motion_info($ss, $sid, $online_nodes, $compiled_rules) {
+sub get_resource_motion_info($ss, $sid, $cd, $online_nodes, $compiled_rules) {
my $dependent_resources = [];
my $blocking_resources_by_node = {};
my ($node_affinity, $resource_affinity) =
$compiled_rules->@{qw(node-affinity resource-affinity)};
- my ($allowed_nodes) = get_node_affinity($node_affinity, $sid, $online_nodes);
+ my ($allowed_nodes, $pri_nodes) = get_node_affinity($node_affinity, $sid, $online_nodes);
my ($together, $separate) = get_affinitive_resources($resource_affinity, $sid);
for my $csid (sort keys %$together) {
@@ -35,7 +35,7 @@ sub get_resource_motion_info($ss, $sid, $online_nodes, $compiled_rules) {
}
for my $node (keys %$online_nodes) {
- if (!$allowed_nodes->{$node}) {
+ if (!$allowed_nodes->{$node} || ($cd->{failback} && !$pri_nodes->{$node})) {
push $blocking_resources_by_node->{$node}->@*,
{
sid => $sid,
diff --git a/src/PVE/HA/Manager.pm b/src/PVE/HA/Manager.pm
index d1ff9615..9067d27b 100644
--- a/src/PVE/HA/Manager.pm
+++ b/src/PVE/HA/Manager.pm
@@ -387,13 +387,15 @@ sub read_lrm_status {
}
sub execute_migration {
- my ($self, $cmd, $task, $sid, $target) = @_;
+ my ($self, $cmd, $task, $sid, $cd, $target) = @_;
my ($haenv, $ss, $ns, $compiled_rules) = $self->@{qw(haenv ss ns compiled_rules)};
my $online_nodes = { map { $_ => 1 } $self->{ns}->list_online_nodes()->@* };
my ($dependent_resources, $blocking_resources_by_node) =
- PVE::HA::Helpers::get_resource_motion_info($ss, $sid, $online_nodes, $compiled_rules);
+ PVE::HA::Helpers::get_resource_motion_info(
+ $ss, $sid, $cd, $online_nodes, $compiled_rules,
+ );
if (my $blocking_resources = $blocking_resources_by_node->{$target}) {
for my $blocking_resource (@$blocking_resources) {
@@ -432,7 +434,7 @@ sub execute_migration {
# read new crm commands and save them into crm master status
sub update_crm_commands {
- my ($self) = @_;
+ my ($self, $sc) = @_;
my ($haenv, $ms, $ns, $ss) = ($self->{haenv}, $self->{ms}, $self->{ns}, $self->{ss});
@@ -453,7 +455,8 @@ sub update_crm_commands {
"ignore crm command - service already on target node: $cmd",
);
} else {
- $self->execute_migration($cmd, $task, $sid, $node);
+ my $cd = $sc->{$sid} // {};
+ $self->execute_migration($cmd, $task, $sid, $cd, $node);
}
}
} else {
@@ -707,7 +710,7 @@ sub manage {
$self->{last_services_digest} = $services_digest;
}
- $self->update_crm_commands();
+ $self->update_crm_commands($sc);
for (;;) {
my $repeat = 0;
diff --git a/src/test/test-node-affinity-nonstrict1/log.expect b/src/test/test-node-affinity-nonstrict1/log.expect
index d86c69de..ca2c40b3 100644
--- a/src/test/test-node-affinity-nonstrict1/log.expect
+++ b/src/test/test-node-affinity-nonstrict1/log.expect
@@ -22,19 +22,5 @@ info 25 node3/lrm: status change wait_for_agent_lock => active
info 25 node3/lrm: starting service vm:101
info 25 node3/lrm: service status vm:101 started
info 120 cmdlist: execute service vm:101 migrate node2
-info 120 node1/crm: got crm command: migrate vm:101 node2
-info 120 node1/crm: migrate service 'vm:101' to node 'node2'
-info 120 node1/crm: service 'vm:101': state changed from 'started' to 'migrate' (node = node3, target = node2)
-info 123 node2/lrm: got lock 'ha_agent_node2_lock'
-info 123 node2/lrm: status change wait_for_agent_lock => active
-info 125 node3/lrm: service vm:101 - start migrate to node 'node2'
-info 125 node3/lrm: service vm:101 - end migrate to node 'node2'
-info 140 node1/crm: service 'vm:101': state changed from 'migrate' to 'started' (node = node2)
-info 140 node1/crm: migrate service 'vm:101' to node 'node3' (running)
-info 140 node1/crm: service 'vm:101': state changed from 'started' to 'migrate' (node = node2, target = node3)
-info 143 node2/lrm: service vm:101 - start migrate to node 'node3'
-info 143 node2/lrm: service vm:101 - end migrate to node 'node3'
-info 160 node1/crm: service 'vm:101': state changed from 'migrate' to 'started' (node = node3)
-info 165 node3/lrm: starting service vm:101
-info 165 node3/lrm: service status vm:101 started
+err 120 node1/crm: crm command 'migrate vm:101 node2' error - service 'vm:101' is not allowed on node 'node2'
info 720 hardware: exit simulation - done
diff --git a/src/test/test-node-affinity-nonstrict7/log.expect b/src/test/test-node-affinity-nonstrict7/log.expect
index 31daa618..54e824ea 100644
--- a/src/test/test-node-affinity-nonstrict7/log.expect
+++ b/src/test/test-node-affinity-nonstrict7/log.expect
@@ -28,35 +28,9 @@ info 25 node3/lrm: status change wait_for_agent_lock => active
info 25 node3/lrm: starting service vm:101
info 25 node3/lrm: service status vm:101 started
info 120 cmdlist: execute service vm:101 migrate node1
-info 120 node1/crm: got crm command: migrate vm:101 node1
-info 120 node1/crm: migrate service 'vm:101' to node 'node1'
-info 120 node1/crm: service 'vm:101': state changed from 'started' to 'migrate' (node = node3, target = node1)
-info 121 node1/lrm: got lock 'ha_agent_node1_lock'
-info 121 node1/lrm: status change wait_for_agent_lock => active
-info 125 node3/lrm: service vm:101 - start migrate to node 'node1'
-info 125 node3/lrm: service vm:101 - end migrate to node 'node1'
-info 140 node1/crm: service 'vm:101': state changed from 'migrate' to 'started' (node = node1)
-info 140 node1/crm: migrate service 'vm:101' to node 'node3' (running)
-info 140 node1/crm: service 'vm:101': state changed from 'started' to 'migrate' (node = node1, target = node3)
-info 141 node1/lrm: service vm:101 - start migrate to node 'node3'
-info 141 node1/lrm: service vm:101 - end migrate to node 'node3'
-info 160 node1/crm: service 'vm:101': state changed from 'migrate' to 'started' (node = node3)
-info 165 node3/lrm: starting service vm:101
-info 165 node3/lrm: service status vm:101 started
+err 120 node1/crm: crm command 'migrate vm:101 node1' error - service 'vm:101' is not allowed on node 'node1'
info 220 cmdlist: execute service vm:101 migrate node2
-info 220 node1/crm: got crm command: migrate vm:101 node2
-info 220 node1/crm: migrate service 'vm:101' to node 'node2'
-info 220 node1/crm: service 'vm:101': state changed from 'started' to 'migrate' (node = node3, target = node2)
-info 225 node3/lrm: service vm:101 - start migrate to node 'node2'
-info 225 node3/lrm: service vm:101 - end migrate to node 'node2'
-info 240 node1/crm: service 'vm:101': state changed from 'migrate' to 'started' (node = node2)
-info 240 node1/crm: migrate service 'vm:101' to node 'node3' (running)
-info 240 node1/crm: service 'vm:101': state changed from 'started' to 'migrate' (node = node2, target = node3)
-info 243 node2/lrm: service vm:101 - start migrate to node 'node3'
-info 243 node2/lrm: service vm:101 - end migrate to node 'node3'
-info 260 node1/crm: service 'vm:101': state changed from 'migrate' to 'started' (node = node3)
-info 265 node3/lrm: starting service vm:101
-info 265 node3/lrm: service status vm:101 started
+err 220 node1/crm: crm command 'migrate vm:101 node2' error - service 'vm:101' is not allowed on node 'node2'
info 320 cmdlist: execute service vm:101 migrate node3
info 320 node1/crm: ignore crm command - service already on target node: migrate vm:101 node3
info 420 cmdlist: execute service vm:102 migrate node3
@@ -81,6 +55,8 @@ info 620 cmdlist: execute service vm:102 migrate node1
info 620 node1/crm: got crm command: migrate vm:102 node1
info 620 node1/crm: migrate service 'vm:102' to node 'node1'
info 620 node1/crm: service 'vm:102': state changed from 'started' to 'migrate' (node = node2, target = node1)
+info 621 node1/lrm: got lock 'ha_agent_node1_lock'
+info 621 node1/lrm: status change wait_for_agent_lock => active
info 623 node2/lrm: service vm:102 - start migrate to node 'node1'
info 623 node2/lrm: service vm:102 - end migrate to node 'node1'
info 640 node1/crm: service 'vm:102': state changed from 'migrate' to 'started' (node = node1)
diff --git a/src/test/test-node-affinity-strict7/log.expect b/src/test/test-node-affinity-strict7/log.expect
index 9c4e9f0b..ae8e43fb 100644
--- a/src/test/test-node-affinity-strict7/log.expect
+++ b/src/test/test-node-affinity-strict7/log.expect
@@ -28,21 +28,7 @@ info 25 node3/lrm: status change wait_for_agent_lock => active
info 25 node3/lrm: starting service vm:101
info 25 node3/lrm: service status vm:101 started
info 120 cmdlist: execute service vm:101 migrate node1
-info 120 node1/crm: got crm command: migrate vm:101 node1
-info 120 node1/crm: migrate service 'vm:101' to node 'node1'
-info 120 node1/crm: service 'vm:101': state changed from 'started' to 'migrate' (node = node3, target = node1)
-info 121 node1/lrm: got lock 'ha_agent_node1_lock'
-info 121 node1/lrm: status change wait_for_agent_lock => active
-info 125 node3/lrm: service vm:101 - start migrate to node 'node1'
-info 125 node3/lrm: service vm:101 - end migrate to node 'node1'
-info 140 node1/crm: service 'vm:101': state changed from 'migrate' to 'started' (node = node1)
-info 140 node1/crm: migrate service 'vm:101' to node 'node3' (running)
-info 140 node1/crm: service 'vm:101': state changed from 'started' to 'migrate' (node = node1, target = node3)
-info 141 node1/lrm: service vm:101 - start migrate to node 'node3'
-info 141 node1/lrm: service vm:101 - end migrate to node 'node3'
-info 160 node1/crm: service 'vm:101': state changed from 'migrate' to 'started' (node = node3)
-info 165 node3/lrm: starting service vm:101
-info 165 node3/lrm: service status vm:101 started
+err 120 node1/crm: crm command 'migrate vm:101 node1' error - service 'vm:101' is not allowed on node 'node1'
info 220 cmdlist: execute service vm:101 migrate node2
err 220 node1/crm: crm command 'migrate vm:101 node2' error - service 'vm:101' is not allowed on node 'node2'
info 320 cmdlist: execute service vm:101 migrate node3
@@ -55,6 +41,8 @@ info 620 cmdlist: execute service vm:102 migrate node1
info 620 node1/crm: got crm command: migrate vm:102 node1
info 620 node1/crm: migrate service 'vm:102' to node 'node1'
info 620 node1/crm: service 'vm:102': state changed from 'started' to 'migrate' (node = node2, target = node1)
+info 621 node1/lrm: got lock 'ha_agent_node1_lock'
+info 621 node1/lrm: status change wait_for_agent_lock => active
info 623 node2/lrm: service vm:102 - start migrate to node 'node1'
info 623 node2/lrm: service vm:102 - end migrate to node 'node1'
info 640 node1/crm: service 'vm:102': state changed from 'migrate' to 'started' (node = node1)
--
2.47.3
* [pve-devel] [PATCH qemu-server 1/1] api: migration preconditions: add node affinity as blocking cause
2025-12-15 15:52 [pve-devel] [PATCH-SERIES container/ha-manager/manager/qemu-server 00/12] HA node affinity blockers (#1497) Daniel Kral
` (8 preceding siblings ...)
2025-12-15 15:52 ` [pve-devel] [PATCH ha-manager 9/9] handle node affinity rules with failback " Daniel Kral
@ 2025-12-15 15:52 ` Daniel Kral
2025-12-15 15:52 ` [pve-devel] [PATCH container " Daniel Kral
2025-12-15 15:52 ` [pve-devel] [PATCH manager 1/1] ui: migrate: display precondition messages for ha node affinity Daniel Kral
11 siblings, 0 replies; 13+ messages in thread
From: Daniel Kral @ 2025-12-15 15:52 UTC (permalink / raw)
To: pve-devel
Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
Needs a version bump for pve-ha-manager.
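Purely as an illustration (the version is a placeholder and the exact
d/control stanza differs), the dependency entry in debian/control would
be bumped to something like:

    pve-ha-manager (>= X.Y.Z),

where X.Y.Z stands for the first pve-ha-manager version that ships the
new 'node-affinity' cause, so that the enum in the API schema and the
values returned by the HA helpers cannot get out of sync.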
src/PVE/API2/Qemu.pm | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/src/PVE/API2/Qemu.pm b/src/PVE/API2/Qemu.pm
index 190878de..5c4f6eb3 100644
--- a/src/PVE/API2/Qemu.pm
+++ b/src/PVE/API2/Qemu.pm
@@ -5196,7 +5196,7 @@ __PACKAGE__->register_method({
type => 'string',
description => "The reason why the HA"
. " resource is blocking the migration.",
- enum => ['resource-affinity'],
+ enum => ['node-affinity', 'resource-affinity'],
},
},
},
--
2.47.3
* [pve-devel] [PATCH container 1/1] api: migration preconditions: add node affinity as blocking cause
2025-12-15 15:52 [pve-devel] [PATCH-SERIES container/ha-manager/manager/qemu-server 00/12] HA node affinity blockers (#1497) Daniel Kral
` (9 preceding siblings ...)
2025-12-15 15:52 ` [pve-devel] [PATCH qemu-server 1/1] api: migration preconditions: add node affinity as blocking cause Daniel Kral
@ 2025-12-15 15:52 ` Daniel Kral
2025-12-15 15:52 ` [pve-devel] [PATCH manager 1/1] ui: migrate: display precondition messages for ha node affinity Daniel Kral
11 siblings, 0 replies; 13+ messages in thread
From: Daniel Kral @ 2025-12-15 15:52 UTC (permalink / raw)
To: pve-devel
Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
Needs a version bump for pve-ha-manager.
src/PVE/API2/LXC.pm | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/src/PVE/API2/LXC.pm b/src/PVE/API2/LXC.pm
index 3d74f71..78e9264 100644
--- a/src/PVE/API2/LXC.pm
+++ b/src/PVE/API2/LXC.pm
@@ -1494,7 +1494,7 @@ __PACKAGE__->register_method({
type => 'string',
description => "The reason why the HA"
. " resource is blocking the migration.",
- enum => ['resource-affinity'],
+ enum => ['node-affinity', 'resource-affinity'],
},
},
},
--
2.47.3
* [pve-devel] [PATCH manager 1/1] ui: migrate: display precondition messages for ha node affinity
2025-12-15 15:52 [pve-devel] [PATCH-SERIES container/ha-manager/manager/qemu-server 00/12] HA node affinity blockers (#1497) Daniel Kral
` (10 preceding siblings ...)
2025-12-15 15:52 ` [pve-devel] [PATCH container " Daniel Kral
@ 2025-12-15 15:52 ` Daniel Kral
11 siblings, 0 replies; 13+ messages in thread
From: Daniel Kral @ 2025-12-15 15:52 UTC (permalink / raw)
To: pve-devel
Extend the VM and container precondition check to show whether a
migration of the VM/container cannot be completed because a node
affinity rule prevents the HA resource from being migrated to the
selected node.
The migration is blocked by the HA Manager's CLI and state machine
anyway, so this is more of an informational heads-up.
Signed-off-by: Daniel Kral <d.kral@proxmox.com>
---
www/manager6/window/Migrate.js | 10 ++++++++++
1 file changed, 10 insertions(+)
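For context, the migrate window reads these causes from the migration
precondition call; an individual blocked entry, using the field names
from the ha-manager and API patches in this series (the surrounding
response layout is only sketched here and may differ), looks roughly
like:

    { "sid": "vm:101", "cause": "node-affinity" }

The new branch below maps that cause to the "HA resource {0} is not
allowed on the selected target node" message for the affected resource.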
diff --git a/www/manager6/window/Migrate.js b/www/manager6/window/Migrate.js
index ff80c70c..8cac54ea 100644
--- a/www/manager6/window/Migrate.js
+++ b/www/manager6/window/Migrate.js
@@ -432,6 +432,11 @@ Ext.define('PVE.window.Migrate', {
),
sid,
);
+ } else if (cause === 'node-affinity') {
+ reasonText = Ext.String.format(
+ gettext('HA resource {0} is not allowed on the selected target node'),
+ sid,
+ );
} else {
reasonText = Ext.String.format(
gettext('blocking HA resource {0} on selected target node'),
@@ -522,6 +527,11 @@ Ext.define('PVE.window.Migrate', {
),
sid,
);
+ } else if (cause === 'node-affinity') {
+ reasonText = Ext.String.format(
+ gettext('HA resource {0} is not allowed on the selected target node'),
+ sid,
+ );
} else {
reasonText = Ext.String.format(
gettext('blocking HA resource {0} on selected target node'),
--
2.47.3