* [pve-devel] [PATCH v2 guest-common 1/2] partially fix 3111: snapshot rollback: improve removing replication snapshots
@ 2021-06-09  9:18 Fabian Ebner
  2021-06-09  9:18 ` [pve-devel] [PATCH v2 guest-common 2/2] fix 3111: replicate guest on rollback if there are replication jobs for it Fabian Ebner
  0 siblings, 1 reply; 4+ messages in thread

From: Fabian Ebner @ 2021-06-09 9:18 UTC (permalink / raw)
To: pve-devel

Get the replicatable volumes from the snapshot config rather than the
current config. And filter those volumes further to those that will
actually be rolled back.

Previously, a volume that only had replication snapshots (e.g. because
it was added after the snapshot was taken, or the vmstate volume) would
lose them. Then, on the next replication run, such a volume would lead
to an error, because replication tried to do a full sync, but the
target volume still exists.

Should be enough for most real-world scenarios, but not a complete fix:
it is still possible to run into the problem by removing the last
(non-replication) snapshots after a rollback before replication can run
once.

The list of volumes is not required to be sorted for prepare(), but it
is now sorted by how foreach_volume() iterates, so no longer random.
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
---

Changes from v1:
    * dropped already applied patch
    * rebased on top of the new filename (since a src/ prefix was added)
    * add another comment mentioning why the additional filtering is
      necessary

 src/PVE/AbstractConfig.pm | 18 ++++++++++++++----
 1 file changed, 14 insertions(+), 4 deletions(-)

diff --git a/src/PVE/AbstractConfig.pm b/src/PVE/AbstractConfig.pm
index 3348d8a..6542ae4 100644
--- a/src/PVE/AbstractConfig.pm
+++ b/src/PVE/AbstractConfig.pm
@@ -974,13 +974,23 @@ sub snapshot_rollback {
     if ($prepare) {
 	my $repl_conf = PVE::ReplicationConfig->new();
 	if ($repl_conf->check_for_existing_jobs($vmid, 1)) {
-	    # remove all replication snapshots
-	    my $volumes = $class->get_replicatable_volumes($storecfg, $vmid, $conf, 1);
-	    my $sorted_volids = [ sort keys %$volumes ];
+	    # remove replication snapshots on volumes affected by rollback *only*!
+	    my $volumes = $class->get_replicatable_volumes($storecfg, $vmid, $snap, 1);
+
+	    # filter by what we actually iterate over below (excludes vmstate!)
+	    my $volids = [];
+	    $class->foreach_volume($snap, sub {
+		my ($vs, $volume) = @_;
+
+		my $volid_key = $class->volid_key();
+		my $volid = $volume->{$volid_key};
+
+		push @{$volids}, $volid if $volumes->{$volid};
+	    });

 	    # remove all local replication snapshots (jobid => undef)
 	    my $logfunc = sub { my $line = shift; chomp $line; print "$line\n"; };
-	    PVE::Replication::prepare($storecfg, $sorted_volids, undef, 1, undef, $logfunc);
+	    PVE::Replication::prepare($storecfg, $volids, undef, 1, undef, $logfunc);
 	}

 	$class->foreach_volume($snap, sub {
-- 
2.30.2
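[Editor's note: the patch's core change is the filtering step above. The sketch below restates that logic in Python purely for illustration; it is not the Perl implementation, and `snap_volumes`/`replicatable` are hypothetical stand-ins for what `foreach_volume()` iterates over and what `get_replicatable_volumes()` returns.]

```python
def volids_affected_by_rollback(snap_volumes, replicatable):
    """Keep only volume IDs that are both replicatable and part of the
    snapshot's iterated volume list. The vmstate volume is not iterated
    by foreach_volume(), so its replication snapshots are left alone."""
    volids = []
    for key, volid in snap_volumes.items():
        if key == "vmstate":  # foreach_volume() skips the vmstate
            continue
        if volid in replicatable:
            volids.append(volid)
    return volids


snap = {"scsi0": "local-zfs:vm-102-disk-0", "vmstate": "local-zfs:vm-102-state"}
repl = {"local-zfs:vm-102-disk-0": {}, "local-zfs:vm-102-state": {}}
print(volids_affected_by_rollback(snap, repl))  # ['local-zfs:vm-102-disk-0']
```

The point of the double filter: a volume must appear in both sets before its replication snapshots are removed, so volumes the rollback never touches keep theirs.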
* [pve-devel] [PATCH v2 guest-common 2/2] fix 3111: replicate guest on rollback if there are replication jobs for it
  2021-06-09  9:18 [pve-devel] [PATCH v2 guest-common 1/2] partially fix 3111: snapshot rollback: improve removing replication snapshots Fabian Ebner
@ 2021-06-09  9:18 ` Fabian Ebner
  2021-06-22  7:41   ` Fabian Grünbichler
  0 siblings, 1 reply; 4+ messages in thread

From: Fabian Ebner @ 2021-06-09 9:18 UTC (permalink / raw)
To: pve-devel

so that there will be a valid replication snapshot again.

Otherwise, replication will be broken after a rollback if the last
(non-replication) snapshot is removed before replication can run again.

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
---

No changes from v1 (except rebase).

Not a huge fan of this, but the alternatives I could come up with don't
seem much better IMHO:

1. Invalidate/remove replicated volumes after a rollback altogether and
   require a full sync on the next replication job afterwards.

2. Another one is to disallow removing the last non-replication snapshot if:
    * there is a replication job configured
    * no replication snapshot for that job currently exists
      (which likely means it was removed by a previous rollback
      operation, but can also happen for a new job that didn't run yet).

3. Hope not very many people immediately delete their snapshots after
   rollback.
Pick a favorite or suggest your own ;)

 src/PVE/AbstractConfig.pm    | 19 +++++++++++++++++--
 src/PVE/ReplicationConfig.pm | 14 ++++++++++++++
 2 files changed, 31 insertions(+), 2 deletions(-)

diff --git a/src/PVE/AbstractConfig.pm b/src/PVE/AbstractConfig.pm
index 6542ae4..6cc0537 100644
--- a/src/PVE/AbstractConfig.pm
+++ b/src/PVE/AbstractConfig.pm
@@ -951,6 +951,9 @@ sub snapshot_rollback {

     my $storecfg = PVE::Storage::config();

+    my $repl_conf = PVE::ReplicationConfig->new();
+    my $logfunc = sub { my $line = shift; chomp $line; print "$line\n"; };
+
     my $data = {};

     my $get_snapshot_config = sub {
@@ -972,7 +975,6 @@ sub snapshot_rollback {
 	$snap = $get_snapshot_config->($conf);

 	if ($prepare) {
-	    my $repl_conf = PVE::ReplicationConfig->new();
 	    if ($repl_conf->check_for_existing_jobs($vmid, 1)) {
 		# remove replication snapshots on volumes affected by rollback *only*!
 		my $volumes = $class->get_replicatable_volumes($storecfg, $vmid, $snap, 1);
@@ -989,7 +991,6 @@ sub snapshot_rollback {
 		});

 		# remove all local replication snapshots (jobid => undef)
-		my $logfunc = sub { my $line = shift; chomp $line; print "$line\n"; };
 		PVE::Replication::prepare($storecfg, $volids, undef, 1, undef, $logfunc);
 	    }

@@ -1047,6 +1048,20 @@ sub snapshot_rollback {

     $prepare = 0;
     $class->lock_config($vmid, $updatefn);
+
+    my $replication_jobs = $repl_conf->list_guests_replication_jobs($vmid);
+    for my $job (@{$replication_jobs}) {
+	my $target = $job->{target};
+	$logfunc->("replicating rolled back guest to node '$target'");
+
+	my $start_time = time();
+	eval {
+	    PVE::Replication::run_replication($class, $job, $start_time, $start_time, $logfunc);
+	};
+	if (my $err = $@) {
+	    warn "unable to replicate rolled back guest to node '$target' - $err";
+	}
+    }
 }

 # bash completion helper
diff --git a/src/PVE/ReplicationConfig.pm b/src/PVE/ReplicationConfig.pm
index fd856a0..84a718f 100644
--- a/src/PVE/ReplicationConfig.pm
+++ b/src/PVE/ReplicationConfig.pm
@@ -228,6 +228,20 @@ sub find_local_replication_job {
     return undef;
 }

+sub list_guests_replication_jobs {
+    my ($cfg, $vmid) = @_;
+
+    my $jobs = [];
+
+    for my $job (values %{$cfg->{ids}}) {
+	next if $job->{type} ne 'local' || $job->{guest} != $vmid;
+
+	push @{$jobs}, $job;
+    }
+
+    return $jobs;
+}
+
 # makes old_target the new source for all local jobs of this guest
 # makes new_target the target for the single local job with target old_target
 sub switch_replication_job_target_nolock {
-- 
2.30.2
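[Editor's note: the filter in the new `list_guests_replication_jobs()` helper above reads, in sketch form, as follows. Python is used purely for illustration; the job entries are simplified dicts standing in for the real replication config objects.]

```python
def list_guest_replication_jobs(cfg_ids, vmid):
    """Return all local replication jobs configured for one guest,
    mirroring the filter in list_guests_replication_jobs(): skip any
    job that is not of type 'local' or belongs to a different guest."""
    return [
        job for job in cfg_ids.values()
        if job.get("type") == "local" and job.get("guest") == vmid
    ]


jobs = {
    "102-0": {"type": "local", "guest": 102, "target": "pve64test02"},
    "102-1": {"type": "local", "guest": 102, "target": "pve64test03"},
    "103-0": {"type": "local", "guest": 103, "target": "pve64test02"},
}
for job in list_guest_replication_jobs(jobs, 102):
    print(job["target"])
```

Rollback then loops over exactly this list and runs one replication per target node, which is why guest 102 above would be replicated to both test nodes after a rollback.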
* Re: [pve-devel] [PATCH v2 guest-common 2/2] fix 3111: replicate guest on rollback if there are replication jobs for it
  2021-06-09  9:18 ` [pve-devel] [PATCH v2 guest-common 2/2] fix 3111: replicate guest on rollback if there are replication jobs for it Fabian Ebner
@ 2021-06-22  7:41 ` Fabian Grünbichler
  2021-08-12  9:12   ` Fabian Ebner
  0 siblings, 1 reply; 4+ messages in thread

From: Fabian Grünbichler @ 2021-06-22 7:41 UTC (permalink / raw)
To: Proxmox VE development discussion

On June 9, 2021 11:18 am, Fabian Ebner wrote:
> so that there will be a valid replication snapshot again.
>
> Otherwise, replication will be broken after a rollback if the last
> (non-replication) snapshot is removed before replication can run again.

I still see issues with these two patches applied..

A: snapshot 'test2', successful replication afterwards, then rollback to
'test2':

delete stale replication snapshot '__replicate_102-1_1624346101__' on local-zfs:vm-102-disk-0
delete stale replication snapshot '__replicate_102-0_1624346407__' on local-zfs:vm-102-disk-0
replicating rolled back guest to node 'pve64test02'
start replication job
guest => VM 102, running => 0
volumes => local-zfs:vm-102-disk-0
create snapshot '__replicate_102-0_1624346600__' on local-zfs:vm-102-disk-0
using secure transmission, rate limit: none
incremental sync 'local-zfs:vm-102-disk-0' (test2 => __replicate_102-0_1624346600__)
send from @test2 to rpool/data/vm-102-disk-0@__replicate_102-0_1624346600__ estimated size is 624B
total estimated size is 624B
successfully imported 'local-zfs:vm-102-disk-0'
delete previous replication snapshot '__replicate_102-0_1624346407__' on local-zfs:vm-102-disk-0
zfs error: could not find any snapshots to destroy; check snapshot names.
end replication job
replicating rolled back guest to node 'pve64test03'
start replication job
guest => VM 102, running => 0
volumes => local-zfs:vm-102-disk-0
create snapshot '__replicate_102-1_1624346602__' on local-zfs:vm-102-disk-0
using secure transmission, rate limit: none
incremental sync 'local-zfs:vm-102-disk-0' (test2 => __replicate_102-1_1624346602__)
send from @test2 to rpool/data/vm-102-disk-0@__replicate_102-0_1624346600__ estimated size is 624B
send from @__replicate_102-0_1624346600__ to rpool/data/vm-102-disk-0@__replicate_102-1_1624346602__ estimated size is 624B
total estimated size is 1.22K
successfully imported 'local-zfs:vm-102-disk-0'
delete previous replication snapshot '__replicate_102-1_1624346101__' on local-zfs:vm-102-disk-0
zfs error: could not find any snapshots to destroy; check snapshot names.
end replication job

two misleading errors from attempting to delete already-cleaned up
snapshots (this is just confusing, likely caused by the state file being
outdated after prepare(); replication is still working as expected
afterwards)

B: successful replication, snapshot 'test3', rollback to 'test3'

delete stale replication snapshot '__replicate_102-1_1624346602__' on local-zfs:vm-102-disk-0
delete stale replication snapshot '__replicate_102-0_1624346600__' on local-zfs:vm-102-disk-0
replicating rolled back guest to node 'pve64test02'
start replication job
guest => VM 102, running => 0
volumes => local-zfs:vm-102-disk-0
create snapshot '__replicate_102-0_1624346683__' on local-zfs:vm-102-disk-0
using secure transmission, rate limit: none
full sync 'local-zfs:vm-102-disk-0' (__replicate_102-0_1624346683__)
full send of rpool/data/vm-102-disk-0@test estimated size is 5.59G
send from @test to rpool/data/vm-102-disk-0@test2 estimated size is 624B
send from @test2 to rpool/data/vm-102-disk-0@test3 estimated size is 624B
send from @test3 to rpool/data/vm-102-disk-0@__replicate_102-0_1624346683__ estimated size is 624B
total estimated size is 5.59G
volume 'rpool/data/vm-102-disk-0' already exists
command 'zfs send -Rpv -- rpool/data/vm-102-disk-0@__replicate_102-0_1624346683__' failed: got signal 13
send/receive failed, cleaning up snapshot(s)..
delete previous replication snapshot '__replicate_102-0_1624346683__' on local-zfs:vm-102-disk-0
end replication job with error: command 'set -o pipefail && pvesm export local-zfs:vm-102-disk-0 zfs - -with-snapshots 1 -snapshot __replicate_102-0_1624346683__ | /usr/bin/ssh -e none -o 'BatchMode=yes' -o 'HostKeyAlias=pve64test02' root@10.0.0.172 -- pvesm import local-zfs:vm-102-disk-0 zfs - -with-snapshots 1 -allow-rename 0' failed: exit code 255
unable to replicate rolled back guest to node 'pve64test02' - command 'set -o pipefail && pvesm export local-zfs:vm-102-disk-0 zfs - -with-snapshots 1 -snapshot __replicate_102-0_1624346683__ | /usr/bin/ssh -e none -o 'BatchMode=yes' -o 'HostKeyAlias=pve64test02' root@10.0.0.172 -- pvesm import local-zfs:vm-102-disk-0 zfs - -with-snapshots 1 -allow-rename 0' failed: exit code 255
replicating rolled back guest to node 'pve64test03'
start replication job
guest => VM 102, running => 0
volumes => local-zfs:vm-102-disk-0
create snapshot '__replicate_102-1_1624346685__' on local-zfs:vm-102-disk-0
using secure transmission, rate limit: none
full sync 'local-zfs:vm-102-disk-0' (__replicate_102-1_1624346685__)
full send of rpool/data/vm-102-disk-0@test estimated size is 5.59G
send from @test to rpool/data/vm-102-disk-0@test2 estimated size is 624B
send from @test2 to rpool/data/vm-102-disk-0@test3 estimated size is 624B
send from @test3 to rpool/data/vm-102-disk-0@__replicate_102-1_1624346685__ estimated size is 624B
total estimated size is 5.59G
volume 'rpool/data/vm-102-disk-0' already exists
command 'zfs send -Rpv -- rpool/data/vm-102-disk-0@__replicate_102-1_1624346685__' failed: got signal 13
send/receive failed, cleaning up snapshot(s)..
delete previous replication snapshot '__replicate_102-1_1624346685__' on local-zfs:vm-102-disk-0
end replication job with error: command 'set -o pipefail && pvesm export local-zfs:vm-102-disk-0 zfs - -with-snapshots 1 -snapshot __replicate_102-1_1624346685__ | /usr/bin/ssh -e none -o 'BatchMode=yes' -o 'HostKeyAlias=pve64test03' root@10.0.0.173 -- pvesm import local-zfs:vm-102-disk-0 zfs - -with-snapshots 1 -allow-rename 0' failed: exit code 255
unable to replicate rolled back guest to node 'pve64test03' - command 'set -o pipefail && pvesm export local-zfs:vm-102-disk-0 zfs - -with-snapshots 1 -snapshot __replicate_102-1_1624346685__ | /usr/bin/ssh -e none -o 'BatchMode=yes' -o 'HostKeyAlias=pve64test03' root@10.0.0.173 -- pvesm import local-zfs:vm-102-disk-0 zfs - -with-snapshots 1 -allow-rename 0' failed: exit code 255

now the replication is broken and requires manual intervention to be
fixed:

source: no replication snapshots, regular snapshots test, test2, test3

target: one replication snapshot, regular snapshots test, test2 (test3
is missing)

>
> Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
> ---
>
> No changes from v1 (except rebase).
>
> Not a huge fan of this, but the alternatives I could come up with don't seem
> much better IMHO:
>
> 1. Invalidate/remove replicated volumes after a rollback altogether and require
>    a full sync on the next replication job afterwards.
>
> 2. Another one is to disallow removing the last non-replication snapshot if:
>     * there is a replication job configured
>     * no replication snapshot for that job currently exists
>       (which likely means it was removed by a previous rollback operation, but
>       can also happen for a new job that didn't run yet).
>
> 3. Hope not very many people immediately delete their snapshots after rollback.
>
> Pick a favorite or suggest your own ;)
>
>  src/PVE/AbstractConfig.pm    | 19 +++++++++++++++++--
>  src/PVE/ReplicationConfig.pm | 14 ++++++++++++++
>  2 files changed, 31 insertions(+), 2 deletions(-)
>
> diff --git a/src/PVE/AbstractConfig.pm b/src/PVE/AbstractConfig.pm
> index 6542ae4..6cc0537 100644
> --- a/src/PVE/AbstractConfig.pm
> +++ b/src/PVE/AbstractConfig.pm
> @@ -951,6 +951,9 @@ sub snapshot_rollback {
>
>      my $storecfg = PVE::Storage::config();
>
> +    my $repl_conf = PVE::ReplicationConfig->new();
> +    my $logfunc = sub { my $line = shift; chomp $line; print "$line\n"; };
> +
>      my $data = {};
>
>      my $get_snapshot_config = sub {
> @@ -972,7 +975,6 @@ sub snapshot_rollback {
> 	$snap = $get_snapshot_config->($conf);
>
> 	if ($prepare) {
> -	    my $repl_conf = PVE::ReplicationConfig->new();
> 	    if ($repl_conf->check_for_existing_jobs($vmid, 1)) {
> 		# remove replication snapshots on volumes affected by rollback *only*!
> 		my $volumes = $class->get_replicatable_volumes($storecfg, $vmid, $snap, 1);
> @@ -989,7 +991,6 @@ sub snapshot_rollback {
> 		});
>
> 		# remove all local replication snapshots (jobid => undef)
> -		my $logfunc = sub { my $line = shift; chomp $line; print "$line\n"; };
> 		PVE::Replication::prepare($storecfg, $volids, undef, 1, undef, $logfunc);

I) maybe prepare or at least this calling context should hold the
replication lock so that it can't race with concurrent replication runs?
II) maybe prepare should update the state file in general (if the last
snapshot referenced there is stale and gets removed, the next
replication run gets confused) - might/should fix A

III) maybe prepare and/or run_replication need to learn how to deal with
"only regular snapshots, but not the last one match" (we could match on
name+guid to ensure the common base is actually a common, previously
replicated base/the same snapshot and not just the same name with
different content) and construct a send stream from that shared snapshot
instead of attempting a full sync where the target already (partially)
exists.. that would fix B, and improve replication robustness in general
(allowing resuming from a partial shared state instead of having to
remove, start over from scratch..)

implementing III would also avoid the need for doing a replication after
rollback - the next replication would handle the situation just fine
unless ALL previously shared snapshots are removed - we could check for
that in the remove-snapshot code path though.. or we could just schedule
a replication here instead of directly doing it. rollback is an almost
instant action (as long as no vmstate is involved), and a replication
can take a long time, so it seems a bit strange to conflate the two..
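[Editor's note: suggestion III — pick the newest snapshot present on both sides, matched by name+guid, and use it as the incremental base — can be sketched as follows. This is a hedged Python illustration only; `guid` stands in for ZFS's per-snapshot guid property, and the list-of-dicts format is an assumption, not the actual `volume_snapshot_list` output.]

```python
def newest_common_snapshot(source_snaps, target_snaps):
    """Pick the most recent snapshot present on both sides as the
    incremental base. Matching on (name, guid) guards against two
    different snapshots that merely share a name."""
    target_keys = {(s["name"], s["guid"]) for s in target_snaps}
    # assume lists are ordered oldest-first, as 'zfs list' would give them
    for snap in reversed(source_snaps):
        if (snap["name"], snap["guid"]) in target_keys:
            return snap["name"]
    return None  # no shared base -> a full sync is unavoidable


src = [{"name": "test", "guid": 1}, {"name": "test2", "guid": 2}, {"name": "test3", "guid": 3}]
tgt = [{"name": "test", "guid": 1}, {"name": "test2", "guid": 2}]
print(newest_common_snapshot(src, tgt))  # test2
```

In scenario B above, this would select 'test2' as the base and send an incremental stream from there, instead of a full sync that fails because the target volume already exists.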
> 	    }
>
> @@ -1047,6 +1048,20 @@ sub snapshot_rollback {
>
>      $prepare = 0;
>      $class->lock_config($vmid, $updatefn);
> +
> +    my $replication_jobs = $repl_conf->list_guests_replication_jobs($vmid);
> +    for my $job (@{$replication_jobs}) {
> +	my $target = $job->{target};
> +	$logfunc->("replicating rolled back guest to node '$target'");
> +
> +	my $start_time = time();
> +	eval {
> +	    PVE::Replication::run_replication($class, $job, $start_time, $start_time, $logfunc);
> +	};
> +	if (my $err = $@) {
> +	    warn "unable to replicate rolled back guest to node '$target' - $err";
> +	}
> +    }
>  }
>
>  # bash completion helper
> diff --git a/src/PVE/ReplicationConfig.pm b/src/PVE/ReplicationConfig.pm
> index fd856a0..84a718f 100644
> --- a/src/PVE/ReplicationConfig.pm
> +++ b/src/PVE/ReplicationConfig.pm
> @@ -228,6 +228,20 @@ sub find_local_replication_job {
>      return undef;
>  }
>
> +sub list_guests_replication_jobs {
> +    my ($cfg, $vmid) = @_;
> +
> +    my $jobs = [];
> +
> +    for my $job (values %{$cfg->{ids}}) {
> +	next if $job->{type} ne 'local' || $job->{guest} != $vmid;
> +
> +	push @{$jobs}, $job;
> +    }
> +
> +    return $jobs;
> +}
> +
>  # makes old_target the new source for all local jobs of this guest
>  # makes new_target the target for the single local job with target old_target
>  sub switch_replication_job_target_nolock {
> --
> 2.30.2
>
> _______________________________________________
> pve-devel mailing list
> pve-devel@lists.proxmox.com
> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
* Re: [pve-devel] [PATCH v2 guest-common 2/2] fix 3111: replicate guest on rollback if there are replication jobs for it
  2021-06-22  7:41 ` Fabian Grünbichler
@ 2021-08-12  9:12 ` Fabian Ebner
  0 siblings, 0 replies; 4+ messages in thread

From: Fabian Ebner @ 2021-08-12 9:12 UTC (permalink / raw)
To: pve-devel, Fabian Grünbichler

I'll likely send the next version of the series today, but wanted to
address some points from here first (so I don't have to quote
everything there).

On 22.06.21 at 09:41, Fabian Grünbichler wrote:
> On June 9, 2021 11:18 am, Fabian Ebner wrote:
>> so that there will be a valid replication snapshot again.
>>
>> Otherwise, replication will be broken after a rollback if the last
>> (non-replication) snapshot is removed before replication can run again.
>
> I still see issues with these two patches applied..
>
> A: snapshot 'test2', successful replication afterwards, then rollback to
> 'test2':
> ----snip----
> zfs error: could not find any snapshots to destroy; check snapshot names.
> end replication job
>
> two misleading errors from attempting to delete already-cleaned up
> snapshots (this is just confusing, likely caused by the state file being
> outdated after prepare(), replication is still working as expected
> afterwards)

Those errors are triggered by $cleanup_local_snapshots in
Replication.pm's replicate(), and there currently is no easy way to
know there whether we are in an after-rollback situation (or another
situation where snapshots might already be deleted). We could just
ignore this specific error, but then we won't detect anymore if an
actually wrong snapshot name was passed in.
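[Editor's note: the trade-off described in the paragraph above can be made concrete with a small sketch. This is a hypothetical Python wrapper, not actual PVE code; `run_zfs` is an injected command runner used only for illustration.]

```python
def destroy_snapshot(run_zfs, dataset, snap, ignore_missing=False):
    """Wrap 'zfs destroy dataset@snap'. With ignore_missing, the
    'nothing to destroy' error is treated as benign -- which, as the
    discussion notes, also masks a misspelled snapshot name."""
    ok, stderr = run_zfs(["destroy", f"{dataset}@{snap}"])
    if ok:
        return True
    if ignore_missing and "could not find any snapshots to destroy" in stderr:
        return False  # already gone, or name was wrong -- indistinguishable
    raise RuntimeError(f"zfs error: {stderr}")


# stub runner simulating a snapshot that prepare() already removed
def stub(args):
    return (False, "could not find any snapshots to destroy; check snapshot names.")


print(destroy_snapshot(stub, "rpool/data/vm-102-disk-0",
                       "__replicate_102-0_1624346407__", ignore_missing=True))
```

Swallowing the error makes the log in scenario A cleaner, at the cost of silently accepting a typo'd name; surfacing it keeps typos detectable but produces the misleading messages seen above.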
> B: successful replication, snapshot 'test3', rollback to 'test3'
> ----snip----
>
> now the replication is broken and requires manual intervention to be
> fixed:
>
> source: no replication snapshots, regular snapshots test, test2, test3
>
> target: one replication snapshot, regular snapshots test, test2 (test3
> is missing)

Will be fixed in the next version of the series.

----snip----

> I) maybe prepare or at least this calling context should hold the
> replication lock so that it can't race with concurrent replication runs?

Ideally, all snapshot operations would need to hold the lock, no?
Otherwise, it might happen that some volumes are replicated before the
snapshot operation was done with them, and some after. I'll look at
that in detail some time and send it as its own series.

> II) maybe prepare should update the state file in general (if the last
> snapshot referenced there is stale and gets removed, the next
> replication run gets confused) - might/should fix A

The problem is that the state is not aware of the individual volumes,
and rollback might not remove replication snapshots from all replicated
volumes. It does currently, but that's wrong and causing this bug in
the first place.

> III) maybe prepare and/or run_replication need to learn how to deal with
> "only regular snapshots, but not the last one match" (we could match on
> name+guid to ensure the common base is actually a common, previously
> replicated base/the same snapshot and not just the same name with
> different content) and construct a send stream from that shared snapshot
> instead of attempting a full sync where the target already (partially)
> exists.. that would fix B, and improve replication robustness in
> general (allowing resuming from a partial shared state instead of having
> to remove, start over from scratch..)
> implementing III would also avoid the need for doing a replication after
> rollback - the next replication would handle the situation just fine
> unless ALL previously shared snapshots are removed - we could check for
> that in the remove snapshot code path though.. or we could just schedule
> a replication here instead of directly doing it. rollback is an almost
> instant action (as long as no vmstate is involved), and a replication
> can take a long time so it seems a bit strange to conflate the two..

I went with a similar approach as also discussed off-list, sans the
guid matching, as that's not really possible to get from the current
volume_snapshot_list while being backwards compatible. And I'm not sure
it's even possible to trigger a mismatch with the new approach, because
of the "prevent removal after rollback until replication is run again"
restriction.

----snip----