public inbox for pve-devel@lists.proxmox.com
 help / color / mirror / Atom feed
* [pve-devel] [PATCH-SERIES v2] some replication-related improvements
@ 2020-10-29 13:31 Fabian Ebner
  2020-10-29 13:31 ` [pve-devel] [PATCH v2 guest-common 1/8] job_status: read only after acquiring the lock Fabian Ebner
                   ` (8 more replies)
  0 siblings, 9 replies; 10+ messages in thread
From: Fabian Ebner @ 2020-10-29 13:31 UTC (permalink / raw)
  To: pve-devel

Is a second version for the two older series:
https://lists.proxmox.com/pipermail/pve-devel/2020-July/044550.html
https://lists.proxmox.com/pipermail/pve-devel/2020-August/044589.html

While the qemu-server patches don't require the guest-common patches,
a change to patch #7 is only relevant with the new switch_replication_target.

Changes from v1:
    * Dropped an already applied patch.
    * Dropped the patch: 'Hold the guest migration lock when changing the replication config'
      Being able to edit a job's config while it's running is not a bad thing
      as long as it doesn't cause problems. And the fact that run_replication
      acquires the guest migration lock should ensure that. It guarantees that
      a job cannot be removed while a migration is running (only *marked* for
      removal and with patch 8 we chicken out in that case) and a new job
      cannot run while a migration is running, so:
          * if it's added before the migration reads the replication
            config and the replication target is the migration target, the
            migration itself might end up running the first replication, but
            that's fine.
          * otherwise it will run for the first time after migration is done.
    * Dropped an RFC making job_status return jobs with source=target=local.
      The replication tests would need to be adapted, and there might be
      a better way than have job_status return them.
    * Added patch #8 for checking if the replication job to be used for
      migration is scheduled for removal.
    * Split up a patch, now patch #4 + #5.
    * Now that the source is updated as well, call switch_replication_job_target
      for every replicated VM (even if we migrated to a non-replication target).
      Previously that happened only in the was-VM-stolen-check in job_status
      (still happens there for VMs that were actually stolen).


guest-common:

Fabian Ebner (5):
  job_status: read only after acquiring the lock
  clarify what the source property is used for in a replication job
  also update sources in switch_replication_job_target
  create nolock variant for switch_replication_job_target
  job_status: simplify fixup of jobs for stolen guests

 PVE/ReplicationConfig.pm | 46 +++++++++++++++++-----------------------
 PVE/ReplicationState.pm  | 42 +++++++++++++++---------------------
 2 files changed, 37 insertions(+), 51 deletions(-)


qemu-server:

Fabian Ebner (3):
  Repeat check for replication target in locked section
  fix checks for transfering replication state/switching job target
  don't migrate replicated VM whose replication job is marked for
    removal

 PVE/API2/Qemu.pm   | 11 +++--------
 PVE/QemuMigrate.pm | 30 +++++++++++++++++++++++-------
 2 files changed, 26 insertions(+), 15 deletions(-)

-- 
2.20.1





^ permalink raw reply	[flat|nested] 10+ messages in thread

* [pve-devel] [PATCH v2 guest-common 1/8] job_status: read only after acquiring the lock
  2020-10-29 13:31 [pve-devel] [PATCH-SERIES v2] some replication-related improvements Fabian Ebner
@ 2020-10-29 13:31 ` Fabian Ebner
  2020-10-29 13:31 ` [pve-devel] [PATCH v2 guest-common 2/8] clarify what the source property is used for in a replication job Fabian Ebner
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: Fabian Ebner @ 2020-10-29 13:31 UTC (permalink / raw)
  To: pve-devel

to get the current replication config and have the VM list
and state object as recent as possible.

Suggested-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
---
 PVE/ReplicationState.pm | 10 ++++------
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/PVE/ReplicationState.pm b/PVE/ReplicationState.pm
index 057cae6..80d45d6 100644
--- a/PVE/ReplicationState.pm
+++ b/PVE/ReplicationState.pm
@@ -234,13 +234,11 @@ sub job_status {
 
     my $jobs = {};
 
-    my $stateobj = read_state();
-
-    my $cfg = PVE::ReplicationConfig->new();
-
-    my $vms = PVE::Cluster::get_vmlist();
-
     my $func = sub {
+	my $stateobj = read_state();
+	my $cfg = PVE::ReplicationConfig->new();
+	my $vms = PVE::Cluster::get_vmlist();
+
 	foreach my $jobid (sort keys %{$cfg->{ids}}) {
 	    my $jobcfg = $cfg->{ids}->{$jobid};
 	    my $vmid = $jobcfg->{guest};
-- 
2.20.1





^ permalink raw reply	[flat|nested] 10+ messages in thread

* [pve-devel] [PATCH v2 guest-common 2/8] clarify what the source property is used for in a replication job
  2020-10-29 13:31 [pve-devel] [PATCH-SERIES v2] some replication-related improvements Fabian Ebner
  2020-10-29 13:31 ` [pve-devel] [PATCH v2 guest-common 1/8] job_status: read only after acquiring the lock Fabian Ebner
@ 2020-10-29 13:31 ` Fabian Ebner
  2020-10-29 13:31 ` [pve-devel] [PATCH v2 guest-common 3/8] also update sources in switch_replication_job_target Fabian Ebner
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: Fabian Ebner @ 2020-10-29 13:31 UTC (permalink / raw)
  To: pve-devel

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
---
 PVE/ReplicationConfig.pm | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/PVE/ReplicationConfig.pm b/PVE/ReplicationConfig.pm
index 66ef842..65436ea 100644
--- a/PVE/ReplicationConfig.pm
+++ b/PVE/ReplicationConfig.pm
@@ -80,7 +80,7 @@ my $defaultData = {
 	    optional => 1,
 	},
 	source => {
-	    description => "Source of the replication.",
+	    description => "For internal use, to detect if the guest was stolen.",
 	    type => 'string', format => 'pve-node',
 	    optional => 1,
 	},
-- 
2.20.1





^ permalink raw reply	[flat|nested] 10+ messages in thread

* [pve-devel] [PATCH v2 guest-common 3/8] also update sources in switch_replication_job_target
  2020-10-29 13:31 [pve-devel] [PATCH-SERIES v2] some replication-related improvements Fabian Ebner
  2020-10-29 13:31 ` [pve-devel] [PATCH v2 guest-common 1/8] job_status: read only after acquiring the lock Fabian Ebner
  2020-10-29 13:31 ` [pve-devel] [PATCH v2 guest-common 2/8] clarify what the source property is used for in a replication job Fabian Ebner
@ 2020-10-29 13:31 ` Fabian Ebner
  2020-10-29 13:31 ` [pve-devel] [PATCH v2 guest-common 4/8] create nolock variant for switch_replication_job_target Fabian Ebner
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: Fabian Ebner @ 2020-10-29 13:31 UTC (permalink / raw)
  To: pve-devel

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
---

Changes from v1:
    * iterate over values directly

 PVE/ReplicationConfig.pm | 20 +++++++++++---------
 1 file changed, 11 insertions(+), 9 deletions(-)

diff --git a/PVE/ReplicationConfig.pm b/PVE/ReplicationConfig.pm
index 65436ea..9dfb9da 100644
--- a/PVE/ReplicationConfig.pm
+++ b/PVE/ReplicationConfig.pm
@@ -228,23 +228,25 @@ sub find_local_replication_job {
     return undef;
 }
 
-# switch local replication job target
+# makes old_target the new source for all local jobs of this guest
+# makes new_target the target for the single local job with target old_target
 sub switch_replication_job_target {
     my ($vmid, $old_target, $new_target) = @_;
 
-    my $transfer_job = sub {
+    my $update_jobs = sub {
 	my $cfg = PVE::ReplicationConfig->new();
-	my $jobcfg = find_local_replication_job($cfg, $vmid, $old_target);
-
-	return if !$jobcfg;
-
-	$jobcfg->{target} = $new_target;
+	foreach my $jobcfg (values %{$cfg->{ids}}) {
+	    next if $jobcfg->{guest} ne $vmid;
+	    next if $jobcfg->{type} ne 'local';
 
+	    $jobcfg->{target} = $new_target if $jobcfg->{target} eq $old_target;
+	    $jobcfg->{source} = $old_target;
+	}
 	$cfg->write();
     };
 
-    lock($transfer_job);
-};
+    lock($update_jobs);
+}
 
 sub delete_job {
     my ($jobid) = @_;
-- 
2.20.1





^ permalink raw reply	[flat|nested] 10+ messages in thread

* [pve-devel] [PATCH v2 guest-common 4/8] create nolock variant for switch_replication_job_target
  2020-10-29 13:31 [pve-devel] [PATCH-SERIES v2] some replication-related improvements Fabian Ebner
                   ` (2 preceding siblings ...)
  2020-10-29 13:31 ` [pve-devel] [PATCH v2 guest-common 3/8] also update sources in switch_replication_job_target Fabian Ebner
@ 2020-10-29 13:31 ` Fabian Ebner
  2020-10-29 13:31 ` [pve-devel] [PATCH v2 guest-common 5/8] job_status: simplify fixup of jobs for stolen guests Fabian Ebner
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: Fabian Ebner @ 2020-10-29 13:31 UTC (permalink / raw)
  To: pve-devel

so that it can be used within job_status

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
---

Changes from v1:
    * this was part of the following patch

 PVE/ReplicationConfig.pm | 23 ++++++++++++++---------
 1 file changed, 14 insertions(+), 9 deletions(-)

diff --git a/PVE/ReplicationConfig.pm b/PVE/ReplicationConfig.pm
index 9dfb9da..6ba87ae 100644
--- a/PVE/ReplicationConfig.pm
+++ b/PVE/ReplicationConfig.pm
@@ -230,21 +230,26 @@ sub find_local_replication_job {
 
 # makes old_target the new source for all local jobs of this guest
 # makes new_target the target for the single local job with target old_target
+sub switch_replication_job_target_nolock {
+    my ($cfg, $vmid, $old_target, $new_target) = @_;
+
+    foreach my $jobcfg (values %{$cfg->{ids}}) {
+	next if $jobcfg->{guest} ne $vmid;
+	next if $jobcfg->{type} ne 'local';
+
+	$jobcfg->{target} = $new_target if $jobcfg->{target} eq $old_target;
+	$jobcfg->{source} = $old_target;
+    }
+    $cfg->write();
+}
+
 sub switch_replication_job_target {
     my ($vmid, $old_target, $new_target) = @_;
 
     my $update_jobs = sub {
 	my $cfg = PVE::ReplicationConfig->new();
-	foreach my $jobcfg (values %{$cfg->{ids}}) {
-	    next if $jobcfg->{guest} ne $vmid;
-	    next if $jobcfg->{type} ne 'local';
-
-	    $jobcfg->{target} = $new_target if $jobcfg->{target} eq $old_target;
-	    $jobcfg->{source} = $old_target;
-	}
-	$cfg->write();
+	$cfg->switch_replication_job_target_nolock($vmid, $old_target, $new_target);
     };
-
     lock($update_jobs);
 }
 
-- 
2.20.1





^ permalink raw reply	[flat|nested] 10+ messages in thread

* [pve-devel] [PATCH v2 guest-common 5/8] job_status: simplify fixup of jobs for stolen guests
  2020-10-29 13:31 [pve-devel] [PATCH-SERIES v2] some replication-related improvements Fabian Ebner
                   ` (3 preceding siblings ...)
  2020-10-29 13:31 ` [pve-devel] [PATCH v2 guest-common 4/8] create nolock variant for switch_replication_job_target Fabian Ebner
@ 2020-10-29 13:31 ` Fabian Ebner
  2020-10-29 13:31 ` [pve-devel] [PATCH v2 qemu-server 6/8] Repeat check for replication target in locked section Fabian Ebner
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: Fabian Ebner @ 2020-10-29 13:31 UTC (permalink / raw)
  To: pve-devel

by using switch_replication_job_target_nolock.

If a job is scheduled for removal and the guest was
stolen, it still makes sense to correct the job entry,
which didn't happen previously.

AFAICT, this was the only user of swap_source_target_nolock.

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
---

Changes from v1:
    * split into two parts (see previous patch)

 PVE/ReplicationConfig.pm | 13 -------------
 PVE/ReplicationState.pm  | 32 +++++++++++++-------------------
 2 files changed, 13 insertions(+), 32 deletions(-)

diff --git a/PVE/ReplicationConfig.pm b/PVE/ReplicationConfig.pm
index 6ba87ae..fd856a0 100644
--- a/PVE/ReplicationConfig.pm
+++ b/PVE/ReplicationConfig.pm
@@ -279,19 +279,6 @@ sub remove_vmid_jobs {
     lock($code);
 }
 
-sub swap_source_target_nolock {
-    my ($jobid) = @_;
-
-    my $cfg = __PACKAGE__->new();
-    my $job = $cfg->{ids}->{$jobid};
-    my $tmp = $job->{source};
-    $job->{source} = $job->{target};
-    $job->{target} = $tmp;
-    $cfg->write();
-
-    return $cfg->{ids}->{$jobid};
-}
-
 package PVE::ReplicationConfig::Cluster;
 
 use base qw(PVE::ReplicationConfig);
diff --git a/PVE/ReplicationState.pm b/PVE/ReplicationState.pm
index 80d45d6..81a1b31 100644
--- a/PVE/ReplicationState.pm
+++ b/PVE/ReplicationState.pm
@@ -251,22 +251,21 @@ sub job_status {
 	    # only consider guest on local node
 	    next if $vms->{ids}->{$vmid}->{node} ne $local_node;
 
+	    # source is optional in schema, but we set it automatically
+	    if (!defined($jobcfg->{source})) {
+		$jobcfg->{source} = $local_node;
+		$cfg->write();
+	    }
+
+	    # fix jobs for stolen guest
+	    $cfg->switch_replication_job_target_nolock($vmid, $local_node, $jobcfg->{source})
+		if $local_node ne $jobcfg->{source};
+
 	    my $target = $jobcfg->{target};
-	    if (!$jobcfg->{remove_job}) {
-		# check if vm was stolen (swapped source target)
-		if ($target eq $local_node) {
-		    my $source = $jobcfg->{source};
-		    if (defined($source) && $source ne $target) {
-			$jobcfg = PVE::ReplicationConfig::swap_source_target_nolock($jobid);
-			$cfg->{ids}->{$jobid} = $jobcfg;
-		    } else {
-			# never sync to local node
-			next;
-		    }
-		}
+	    # never sync to local node
+	    next if !$jobcfg->{remove_job} && $target eq $local_node;
 
-		next if !$get_disabled && $jobcfg->{disable};
-	    }
+	    next if !$get_disabled && $jobcfg->{disable};
 
 	    my $state = extract_job_state($stateobj, $jobcfg);
 	    $jobcfg->{state} = $state;
@@ -293,11 +292,6 @@ sub job_status {
 
 	    $jobcfg->{next_sync} = $next_sync;
 
-	    if (!defined($jobcfg->{source}) || $jobcfg->{source} ne $local_node) {
-		$jobcfg->{source} = $cfg->{ids}->{$jobid}->{source} = $local_node;
-		PVE::ReplicationConfig::write($cfg);
-	    }
-
 	    $jobs->{$jobid} = $jobcfg;
 	}
     };
-- 
2.20.1





^ permalink raw reply	[flat|nested] 10+ messages in thread

* [pve-devel] [PATCH v2 qemu-server 6/8] Repeat check for replication target in locked section
  2020-10-29 13:31 [pve-devel] [PATCH-SERIES v2] some replication-related improvements Fabian Ebner
                   ` (4 preceding siblings ...)
  2020-10-29 13:31 ` [pve-devel] [PATCH v2 guest-common 5/8] job_status: simplify fixup of jobs for stolen guests Fabian Ebner
@ 2020-10-29 13:31 ` Fabian Ebner
  2020-10-29 13:31 ` [pve-devel] [PATCH v2 qemu-server 7/8] fix checks for transfering replication state/switching job target Fabian Ebner
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: Fabian Ebner @ 2020-10-29 13:31 UTC (permalink / raw)
  To: pve-devel

No need to warn twice, so the warning from the outside check
was removed.

Suggested-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
---
 PVE/API2/Qemu.pm   | 11 +++--------
 PVE/QemuMigrate.pm | 13 +++++++++++++
 2 files changed, 16 insertions(+), 8 deletions(-)

diff --git a/PVE/API2/Qemu.pm b/PVE/API2/Qemu.pm
index e8de4ea..bef83df 100644
--- a/PVE/API2/Qemu.pm
+++ b/PVE/API2/Qemu.pm
@@ -3568,14 +3568,9 @@ __PACKAGE__->register_method({
 	    my $repl_conf = PVE::ReplicationConfig->new();
 	    my $is_replicated = $repl_conf->check_for_existing_jobs($vmid, 1);
 	    my $is_replicated_to_target = defined($repl_conf->find_local_replication_job($vmid, $target));
-	    if ($is_replicated && !$is_replicated_to_target) {
-		if ($param->{force}) {
-		    warn "WARNING: Node '$target' is not a replication target. Existing replication " .
-		         "jobs will fail after migration!\n";
-		} else {
-		    die "Cannot live-migrate replicated VM to node '$target' - not a replication target." .
-		        " Use 'force' to override.\n";
-		}
+	    if (!$param->{force} && $is_replicated && !$is_replicated_to_target) {
+		die "Cannot live-migrate replicated VM to node '$target' - not a replication " .
+		    "target. Use 'force' to override.\n";
 	    }
 	} else {
 	    warn "VM isn't running. Doing offline migration instead.\n" if $param->{online};
diff --git a/PVE/QemuMigrate.pm b/PVE/QemuMigrate.pm
index 2f4eec3..a8bcd15 100644
--- a/PVE/QemuMigrate.pm
+++ b/PVE/QemuMigrate.pm
@@ -227,6 +227,19 @@ sub prepare {
 	die "can't migrate running VM without --online\n" if !$online;
 	$running = $pid;
 
+	my $repl_conf = PVE::ReplicationConfig->new();
+	my $is_replicated = $repl_conf->check_for_existing_jobs($vmid, 1);
+	my $is_replicated_to_target = defined($repl_conf->find_local_replication_job($vmid, $self->{node}));
+	if ($is_replicated && !$is_replicated_to_target) {
+	    if ($self->{opts}->{force}) {
+		$self->log('warn', "WARNING: Node '$self->{node}' is not a replication target. Existing " .
+			           "replication jobs will fail after migration!\n");
+	    } else {
+		die "Cannot live-migrate replicated VM to node '$self->{node}' - not a replication " .
+		    "target. Use 'force' to override.\n";
+	    }
+	}
+
 	$self->{forcemachine} = PVE::QemuServer::Machine::qemu_machine_pxe($vmid, $conf);
 
 	# To support custom CPU types, we keep QEMU's "-cpu" parameter intact.
-- 
2.20.1





^ permalink raw reply	[flat|nested] 10+ messages in thread

* [pve-devel] [PATCH v2 qemu-server 7/8] fix checks for transfering replication state/switching job target
  2020-10-29 13:31 [pve-devel] [PATCH-SERIES v2] some replication-related improvements Fabian Ebner
                   ` (5 preceding siblings ...)
  2020-10-29 13:31 ` [pve-devel] [PATCH v2 qemu-server 6/8] Repeat check for replication target in locked section Fabian Ebner
@ 2020-10-29 13:31 ` Fabian Ebner
  2020-10-29 13:31 ` [pve-devel] [PATCH/RFC v2 qemu-server 8/8] don't migrate replicated VM whose replication job is marked for removal Fabian Ebner
  2020-11-09  9:48 ` [pve-devel] applied-series: [PATCH-SERIES v2] some replication-related improvements Fabian Grünbichler
  8 siblings, 0 replies; 10+ messages in thread
From: Fabian Ebner @ 2020-10-29 13:31 UTC (permalink / raw)
  To: pve-devel

In some cases $self->{replicated_volumes} will be auto-vivified
to {} by checks like
next if $self->{replicated_volumes}->{$volid}
and then {} would evaluate to true in a boolean context.

Now the replication information is retrieved once in prepare,
and used to decide whether to make the calls or not.

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
---

Changes from v1:
    * use 'if $self->{is_replicated}' both times, so that
      even if we migrated to a non-replication target,
      the source-parameters are updated in the jobs

 PVE/QemuMigrate.pm | 21 ++++++++++-----------
 1 file changed, 10 insertions(+), 11 deletions(-)

diff --git a/PVE/QemuMigrate.pm b/PVE/QemuMigrate.pm
index a8bcd15..3fb2850 100644
--- a/PVE/QemuMigrate.pm
+++ b/PVE/QemuMigrate.pm
@@ -220,6 +220,10 @@ sub prepare {
     # test if VM exists
     my $conf = $self->{vmconf} = PVE::QemuConfig->load_config($vmid);
 
+    my $repl_conf = PVE::ReplicationConfig->new();
+    $self->{replication_jobcfg} = $repl_conf->find_local_replication_job($vmid, $self->{node});
+    $self->{is_replicated} = $repl_conf->check_for_existing_jobs($vmid, 1);
+
     PVE::QemuConfig->check_lock($conf);
 
     my $running = 0;
@@ -227,10 +231,7 @@ sub prepare {
 	die "can't migrate running VM without --online\n" if !$online;
 	$running = $pid;
 
-	my $repl_conf = PVE::ReplicationConfig->new();
-	my $is_replicated = $repl_conf->check_for_existing_jobs($vmid, 1);
-	my $is_replicated_to_target = defined($repl_conf->find_local_replication_job($vmid, $self->{node}));
-	if ($is_replicated && !$is_replicated_to_target) {
+	if ($self->{is_replicated} && !$self->{replication_jobcfg}) {
 	    if ($self->{opts}->{force}) {
 		$self->log('warn', "WARNING: Node '$self->{node}' is not a replication target. Existing " .
 			           "replication jobs will fail after migration!\n");
@@ -362,9 +363,7 @@ sub sync_disks {
 	    });
 	}
 
-	my $rep_cfg = PVE::ReplicationConfig->new();
-	my $replication_jobcfg = $rep_cfg->find_local_replication_job($vmid, $self->{node});
-	my $replicatable_volumes = !$replication_jobcfg ? {}
+	my $replicatable_volumes = !$self->{replication_jobcfg} ? {}
 	    : PVE::QemuConfig->get_replicatable_volumes($storecfg, $vmid, $conf, 0, 1);
 
 	my $test_volid = sub {
@@ -489,7 +488,7 @@ sub sync_disks {
 	    }
 	}
 
-	if ($replication_jobcfg) {
+	if ($self->{replication_jobcfg}) {
 	    if ($self->{running}) {
 
 		my $version = PVE::QemuServer::kvm_user_version();
@@ -523,7 +522,7 @@ sub sync_disks {
 	    my $start_time = time();
 	    my $logfunc = sub { $self->log('info', shift) };
 	    $self->{replicated_volumes} = PVE::Replication::run_replication(
-	       'PVE::QemuConfig', $replication_jobcfg, $start_time, $start_time, $logfunc);
+	       'PVE::QemuConfig', $self->{replication_jobcfg}, $start_time, $start_time, $logfunc);
 	}
 
 	# sizes in config have to be accurate for remote node to correctly
@@ -1193,7 +1192,7 @@ sub phase3_cleanup {
     }
 
     # transfer replication state before move config
-    $self->transfer_replication_state() if $self->{replicated_volumes};
+    $self->transfer_replication_state() if $self->{is_replicated};
 
     # move config to remote node
     my $conffile = PVE::QemuConfig->config_file($vmid);
@@ -1202,7 +1201,7 @@ sub phase3_cleanup {
     die "Failed to move config to node '$self->{node}' - rename failed: $!\n"
         if !rename($conffile, $newconffile);
 
-    $self->switch_replication_job_target() if $self->{replicated_volumes};
+    $self->switch_replication_job_target() if $self->{is_replicated};
 
     if ($self->{livemigration}) {
 	if ($self->{stopnbd}) {
-- 
2.20.1





^ permalink raw reply	[flat|nested] 10+ messages in thread

* [pve-devel] [PATCH/RFC v2 qemu-server 8/8] don't migrate replicated VM whose replication job is marked for removal
  2020-10-29 13:31 [pve-devel] [PATCH-SERIES v2] some replication-related improvements Fabian Ebner
                   ` (6 preceding siblings ...)
  2020-10-29 13:31 ` [pve-devel] [PATCH v2 qemu-server 7/8] fix checks for transfering replication state/switching job target Fabian Ebner
@ 2020-10-29 13:31 ` Fabian Ebner
  2020-11-09  9:48 ` [pve-devel] applied-series: [PATCH-SERIES v2] some replication-related improvements Fabian Grünbichler
  8 siblings, 0 replies; 10+ messages in thread
From: Fabian Ebner @ 2020-10-29 13:31 UTC (permalink / raw)
  To: pve-devel

while it didn't actually fail, we probably want to avoid the behavior:

With remove_job=full:
    * run_replication called during migration causes the replicated volumes to
      be removed
    * migration continues by fully copying all volumes

With remove_job=local:
    * run_replication called during migration causes the job (and local
      replication snapshots) to be removed
    * migration continues by fully copying all volumes and renaming them to
      avoid collision with the still existing remote volumes

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
---

New in v2

Alternatively, we could throw out the remove_job property before calling
run_replication during migration, use the replicated volumes, and let
the scheduled pvesr call remove the job after migration

 PVE/QemuMigrate.pm | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/PVE/QemuMigrate.pm b/PVE/QemuMigrate.pm
index 3fb2850..c6623e1 100644
--- a/PVE/QemuMigrate.pm
+++ b/PVE/QemuMigrate.pm
@@ -224,6 +224,10 @@ sub prepare {
     $self->{replication_jobcfg} = $repl_conf->find_local_replication_job($vmid, $self->{node});
     $self->{is_replicated} = $repl_conf->check_for_existing_jobs($vmid, 1);
 
+    if ($self->{replication_jobcfg} && defined($self->{replication_jobcfg}->{remove_job})) {
+	die "refusing to migrate replicated VM whose replication job is marked for removal\n";
+    }
+
     PVE::QemuConfig->check_lock($conf);
 
     my $running = 0;
-- 
2.20.1





^ permalink raw reply	[flat|nested] 10+ messages in thread

* [pve-devel] applied-series: [PATCH-SERIES v2] some replication-related improvements
  2020-10-29 13:31 [pve-devel] [PATCH-SERIES v2] some replication-related improvements Fabian Ebner
                   ` (7 preceding siblings ...)
  2020-10-29 13:31 ` [pve-devel] [PATCH/RFC v2 qemu-server 8/8] don't migrate replicated VM whose replication job is marked for removal Fabian Ebner
@ 2020-11-09  9:48 ` Fabian Grünbichler
  8 siblings, 0 replies; 10+ messages in thread
From: Fabian Grünbichler @ 2020-11-09  9:48 UTC (permalink / raw)
  To: Proxmox VE development discussion

On October 29, 2020 2:31 pm, Fabian Ebner wrote:
> Is a second version for the two older series:
> https://lists.proxmox.com/pipermail/pve-devel/2020-July/044550.html
> https://lists.proxmox.com/pipermail/pve-devel/2020-August/044589.html
> 
> While the qemu-server patches don't require the guest-common patches,
> a change to patch #7 is only relevant with the new switch_replication_target.
> 
> Changes from v1:
>     * Dropped an already applied patch.
>     * Dropped the patch: 'Hold the guest migration lock when changing the replication config'
>       Being able to edit a job's config while it's running is not a bad thing
>       as long as it doesn't cause problems. And the fact that run_replication
>       acquires the guest migration lock should ensure that. It guarantees that
>       a job cannot be removed while a migration is running (only *marked* for
>       removal and with patch 8 we chicken out in that case) and a new job
>       cannot run while a migration is running, so:
>           * if it's added before the migration reads the replication
>             config and the replication target is the migration target, the
>             migration itself might end up running the first replication, but
>             that's fine.
>           * otherwise it will run for the first time after migration is done.
>     * Dropped an RFC making job_status return jobs with source=target=local.
>       The replication tests would need to be adapted, and there might be
>       a better way than have job_status return them.
>     * Added patch #8 for checking if the replication job to be used for
>       migration is scheduled for removal.
>     * Split up a patch, now patch #4 + #5.
>     * Now that the source is updated as well, call switch_replication_job_target
>       for every replicated VM (even if we migrated to a non-replication target).
>       Previously that happened only in the was-VM-stolen-check in job_status
>       (still happens there for VMs that were actually stolen).
> 
> 
> guest-common:
> 
> Fabian Ebner (5):
>   job_status: read only after acquiring the lock
>   clarify what the source property is used for in a replication job
>   also update sources in switch_replication_job_target
>   create nolock variant for switch_replication_job_target
>   job_status: simplify fixup of jobs for stolen guests
> 
>  PVE/ReplicationConfig.pm | 46 +++++++++++++++++-----------------------
>  PVE/ReplicationState.pm  | 42 +++++++++++++++---------------------
>  2 files changed, 37 insertions(+), 51 deletions(-)
> 
> 
> qemu-server:
> 
> Fabian Ebner (3):
>   Repeat check for replication target in locked section
>   fix checks for transfering replication state/switching job target
>   don't migrate replicated VM whose replication job is marked for
>     removal
> 
>  PVE/API2/Qemu.pm   | 11 +++--------
>  PVE/QemuMigrate.pm | 30 +++++++++++++++++++++++-------
>  2 files changed, 26 insertions(+), 15 deletions(-)
> 
> -- 
> 2.20.1
> 
> 
> 
> _______________________________________________
> pve-devel mailing list
> pve-devel@lists.proxmox.com
> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
> 
> 
> 




^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2020-11-09  9:49 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-10-29 13:31 [pve-devel] [PATCH-SERIES v2] some replication-related improvements Fabian Ebner
2020-10-29 13:31 ` [pve-devel] [PATCH v2 guest-common 1/8] job_status: read only after acquiring the lock Fabian Ebner
2020-10-29 13:31 ` [pve-devel] [PATCH v2 guest-common 2/8] clarify what the source property is used for in a replication job Fabian Ebner
2020-10-29 13:31 ` [pve-devel] [PATCH v2 guest-common 3/8] also update sources in switch_replication_job_target Fabian Ebner
2020-10-29 13:31 ` [pve-devel] [PATCH v2 guest-common 4/8] create nolock variant for switch_replication_job_target Fabian Ebner
2020-10-29 13:31 ` [pve-devel] [PATCH v2 guest-common 5/8] job_status: simplify fixup of jobs for stolen guests Fabian Ebner
2020-10-29 13:31 ` [pve-devel] [PATCH v2 qemu-server 6/8] Repeat check for replication target in locked section Fabian Ebner
2020-10-29 13:31 ` [pve-devel] [PATCH v2 qemu-server 7/8] fix checks for transfering replication state/switching job target Fabian Ebner
2020-10-29 13:31 ` [pve-devel] [PATCH/RFC v2 qemu-server 8/8] don't migrate replicated VM whose replication job is marked for removal Fabian Ebner
2020-11-09  9:48 ` [pve-devel] applied-series: [PATCH-SERIES v2] some replication-related improvements Fabian Grünbichler

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox
Service provided by Proxmox Server Solutions GmbH | Privacy | Legal