From: Fabian Grünbichler
To: Proxmox VE development discussion
Date: Tue, 22 Jun 2021 09:41:35 +0200
Subject: Re: [pve-devel] [PATCH v2 guest-common 2/2] fix 3111: replicate guest on rollback if there are replication jobs for it
Message-Id: <1624346775.lqzf5lnz7c.astroid@nora.none>
In-Reply-To: <20210609091858.27219-2-f.ebner@proxmox.com>
References: <20210609091858.27219-1-f.ebner@proxmox.com> <20210609091858.27219-2-f.ebner@proxmox.com>

On June 9, 2021 11:18 am, Fabian Ebner wrote:
> so that there will be a valid replication snapshot again.
> 
> Otherwise, replication will be broken after a rollback if the last
> (non-replication) snapshot is removed before replication can run again.

I still see issues with these two patches applied..

A: snapshot 'test2', successful replication afterwards, then rollback to 'test2':

delete stale replication snapshot '__replicate_102-1_1624346101__' on local-zfs:vm-102-disk-0
delete stale replication snapshot '__replicate_102-0_1624346407__' on local-zfs:vm-102-disk-0
replicating rolled back guest to node 'pve64test02'
start replication job
guest => VM 102, running => 0
volumes => local-zfs:vm-102-disk-0
create snapshot '__replicate_102-0_1624346600__' on local-zfs:vm-102-disk-0
using secure transmission, rate limit: none
incremental sync 'local-zfs:vm-102-disk-0' (test2 => __replicate_102-0_1624346600__)
send from @test2 to rpool/data/vm-102-disk-0@__replicate_102-0_1624346600__ estimated size is 624B
total estimated size is 624B
successfully imported 'local-zfs:vm-102-disk-0'
delete previous replication snapshot '__replicate_102-0_1624346407__' on local-zfs:vm-102-disk-0
zfs error: could not find any snapshots to destroy; check snapshot names.
end replication job
replicating rolled back guest to node 'pve64test03'
start replication job
guest => VM 102, running => 0
volumes => local-zfs:vm-102-disk-0
create snapshot '__replicate_102-1_1624346602__' on local-zfs:vm-102-disk-0
using secure transmission, rate limit: none
incremental sync 'local-zfs:vm-102-disk-0' (test2 => __replicate_102-1_1624346602__)
send from @test2 to rpool/data/vm-102-disk-0@__replicate_102-0_1624346600__ estimated size is 624B
send from @__replicate_102-0_1624346600__ to rpool/data/vm-102-disk-0@__replicate_102-1_1624346602__ estimated size is 624B
total estimated size is 1.22K
successfully imported 'local-zfs:vm-102-disk-0'
delete previous replication snapshot '__replicate_102-1_1624346101__' on local-zfs:vm-102-disk-0
zfs error: could not find any snapshots to destroy; check snapshot names.
end replication job

two misleading errors from attempting to delete already-cleaned up
snapshots (this is just confusing, likely caused by the state file being
outdated after prepare(), replication is still working as expected
afterwards)
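
just to illustrate the kind of guard that would avoid the misleading
errors in A - the sub name, its parameters and the $existing_snapshots
hash below are made up for this sketch, it is not existing
PVE::Replication code:

    use strict;
    use warnings;

    use PVE::Storage;

    # only destroy the previous replication snapshot recorded in the job
    # state if it actually still exists - after a rollback, prepare() may
    # already have removed it, and trying again just surfaces
    # "could not find any snapshots to destroy"
    sub cleanup_previous_repl_snapshot {
        my ($storecfg, $volid, $prev_snap, $existing_snapshots, $logfunc) = @_;

        return if !defined($prev_snap);

        if ($existing_snapshots->{$prev_snap}) {
            PVE::Storage::volume_snapshot_delete($storecfg, $volid, $prev_snap);
        } else {
            $logfunc->("previous replication snapshot '$prev_snap' already removed, skipping cleanup");
        }
    }

updating the state file so it no longer references the removed snapshot
(see II further down) would address the same thing at the source.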

B: successful replication, snapshot 'test3', rollback to 'test3'

delete stale replication snapshot '__replicate_102-1_1624346602__' on local-zfs:vm-102-disk-0
delete stale replication snapshot '__replicate_102-0_1624346600__' on local-zfs:vm-102-disk-0
replicating rolled back guest to node 'pve64test02'
start replication job
guest => VM 102, running => 0
volumes => local-zfs:vm-102-disk-0
create snapshot '__replicate_102-0_1624346683__' on local-zfs:vm-102-disk-0
using secure transmission, rate limit: none
full sync 'local-zfs:vm-102-disk-0' (__replicate_102-0_1624346683__)
full send of rpool/data/vm-102-disk-0@test estimated size is 5.59G
send from @test to rpool/data/vm-102-disk-0@test2 estimated size is 624B
send from @test2 to rpool/data/vm-102-disk-0@test3 estimated size is 624B
send from @test3 to rpool/data/vm-102-disk-0@__replicate_102-0_1624346683__ estimated size is 624B
total estimated size is 5.59G
volume 'rpool/data/vm-102-disk-0' already exists
command 'zfs send -Rpv -- rpool/data/vm-102-disk-0@__replicate_102-0_1624346683__' failed: got signal 13
send/receive failed, cleaning up snapshot(s)..
delete previous replication snapshot '__replicate_102-0_1624346683__' on local-zfs:vm-102-disk-0
end replication job with error: command 'set -o pipefail && pvesm export local-zfs:vm-102-disk-0 zfs - -with-snapshots 1 -snapshot __replicate_102-0_1624346683__ | /usr/bin/ssh -e none -o 'BatchMode=yes' -o 'HostKeyAlias=pve64test02' root@10.0.0.172 -- pvesm import local-zfs:vm-102-disk-0 zfs - -with-snapshots 1 -allow-rename 0' failed: exit code 255
unable to replicate rolled back guest to node 'pve64test02' - command 'set -o pipefail && pvesm export local-zfs:vm-102-disk-0 zfs - -with-snapshots 1 -snapshot __replicate_102-0_1624346683__ | /usr/bin/ssh -e none -o 'BatchMode=yes' -o 'HostKeyAlias=pve64test02' root@10.0.0.172 -- pvesm import local-zfs:vm-102-disk-0 zfs - -with-snapshots 1 -allow-rename 0' failed: exit code 255
replicating rolled back guest to node 'pve64test03'
start replication job
guest => VM 102, running => 0
volumes => local-zfs:vm-102-disk-0
create snapshot '__replicate_102-1_1624346685__' on local-zfs:vm-102-disk-0
using secure transmission, rate limit: none
full sync 'local-zfs:vm-102-disk-0' (__replicate_102-1_1624346685__)
full send of rpool/data/vm-102-disk-0@test estimated size is 5.59G
send from @test to rpool/data/vm-102-disk-0@test2 estimated size is 624B
send from @test2 to rpool/data/vm-102-disk-0@test3 estimated size is 624B
send from @test3 to rpool/data/vm-102-disk-0@__replicate_102-1_1624346685__ estimated size is 624B
total estimated size is 5.59G
volume 'rpool/data/vm-102-disk-0' already exists
command 'zfs send -Rpv -- rpool/data/vm-102-disk-0@__replicate_102-1_1624346685__' failed: got signal 13
send/receive failed, cleaning up snapshot(s)..
delete previous replication snapshot '__replicate_102-1_1624346685__' on local-zfs:vm-102-disk-0
end replication job with error: command 'set -o pipefail && pvesm export local-zfs:vm-102-disk-0 zfs - -with-snapshots 1 -snapshot __replicate_102-1_1624346685__ | /usr/bin/ssh -e none -o 'BatchMode=yes' -o 'HostKeyAlias=pve64test03' root@10.0.0.173 -- pvesm import local-zfs:vm-102-disk-0 zfs - -with-snapshots 1 -allow-rename 0' failed: exit code 255
unable to replicate rolled back guest to node 'pve64test03' - command 'set -o pipefail && pvesm export local-zfs:vm-102-disk-0 zfs - -with-snapshots 1 -snapshot __replicate_102-1_1624346685__ | /usr/bin/ssh -e none -o 'BatchMode=yes' -o 'HostKeyAlias=pve64test03' root@10.0.0.173 -- pvesm import local-zfs:vm-102-disk-0 zfs - -with-snapshots 1 -allow-rename 0' failed: exit code 255

now the replication is broken and requires manual intervention to be
fixed:

source: no replication snapshots, regular snapshots test, test2, test3
target: one replication snapshot, regular snapshots test, test2 (test3 is missing)

> 
> Signed-off-by: Fabian Ebner
> ---
> 
> No changes from v1 (except rebase).
> 
> Not a huge fan of this, but the alternatives I could come up with don't seem
> much better IMHO:
> 
> 1. Invalidate/remove replicated volumes after a rollback altogether and require
>    a full sync on the next replication job afterwards.
> 
> 2. Another one is to disallow removing the last non-replication snapshot if:
>      * there is a replication job configured
>      * no replication snapshot for that job currently exists
>        (which likely means it was removed by a previous rollback operation, but
>        can also happen for a new job that didn't run yet).
> 
> 3. Hope not very many people immediately delete their snapshots after rollback.
> 
> Pick a favorite or suggest your own ;)
> 
>  src/PVE/AbstractConfig.pm    | 19 +++++++++++++++++--
>  src/PVE/ReplicationConfig.pm | 14 ++++++++++++++
>  2 files changed, 31 insertions(+), 2 deletions(-)
> 
> diff --git a/src/PVE/AbstractConfig.pm b/src/PVE/AbstractConfig.pm
> index 6542ae4..6cc0537 100644
> --- a/src/PVE/AbstractConfig.pm
> +++ b/src/PVE/AbstractConfig.pm
> @@ -951,6 +951,9 @@ sub snapshot_rollback {
>  
>      my $storecfg = PVE::Storage::config();
>  
> +    my $repl_conf = PVE::ReplicationConfig->new();
> +    my $logfunc = sub { my $line = shift; chomp $line; print "$line\n"; };
> +
>      my $data = {};
>  
>      my $get_snapshot_config = sub {
> @@ -972,7 +975,6 @@ sub snapshot_rollback {
>          $snap = $get_snapshot_config->($conf);
>  
>          if ($prepare) {
> -            my $repl_conf = PVE::ReplicationConfig->new();
>              if ($repl_conf->check_for_existing_jobs($vmid, 1)) {
>                  # remove replication snapshots on volumes affected by rollback *only*!
>                  my $volumes = $class->get_replicatable_volumes($storecfg, $vmid, $snap, 1);
> @@ -989,7 +991,6 @@ sub snapshot_rollback {
>              });
>  
>              # remove all local replication snapshots (jobid => undef)
> -            my $logfunc = sub { my $line = shift; chomp $line; print "$line\n"; };
>              PVE::Replication::prepare($storecfg, $volids, undef, 1, undef, $logfunc);

I) maybe prepare(), or at least this calling context, should hold the
replication lock so that it can't race with concurrent replication runs?

II) maybe prepare() should update the state file in general - if the last
snapshot referenced there is stale and gets removed, the next replication
run gets confused. this might/should fix A.

III) maybe prepare() and/or run_replication() need to learn how to deal
with the case where only regular snapshots, but not the last one, match.
we could match on name+guid to ensure the common base really is a
previously replicated snapshot and not just one with the same name but
different content, and construct a send stream from that shared snapshot
instead of attempting a full sync where the target already (partially)
exists (see the sketch below). that would fix B, and improve replication
robustness in general by allowing a resume from a partially shared state
instead of having to remove the target volume and start over from
scratch..

implementing III would also remove the need to run a replication right
after rollback - the next scheduled replication would handle the
situation just fine, unless ALL previously shared snapshots are removed,
and we could check for that in the snapshot removal code path though..
or we could just schedule a replication here instead of running it
directly: rollback is an almost instant action (as long as no vmstate is
involved), while a replication can take a long time, so it seems a bit
strange to conflate the two..
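
to make the name+guid matching in III a bit more concrete, here is a
rough, self-contained sketch - it is not an existing PVE::Replication
helper, and it assumes the caller passes in hashes of snapshot name =>
guid for both sides (e.g. collected via 'zfs get guid') plus the source
snapshots in creation order:

    use strict;
    use warnings;

    # return the most recent snapshot that exists on both sides with the
    # same guid, i.e. one that is safe to use as an incremental base
    sub find_common_base_snapshot {
        my ($source_snaps, $target_snaps, $source_order) = @_;

        for my $snap (reverse @{$source_order}) {
            my $target_guid = $target_snaps->{$snap};
            next if !defined($target_guid);

            # the name alone is not enough - after a rollback a snapshot
            # can be re-created with the same name but different content,
            # so the guid has to match as well
            return $snap if $source_snaps->{$snap} eq $target_guid;
        }

        return undef; # nothing shared, only a full sync can help
    }

run_replication() could then send an incremental stream from that
snapshot (after rolling the target back to it) instead of ending up in
the "volume 'rpool/data/vm-102-disk-0' already exists" full-sync failure
seen in B.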

>          }
>  
> @@ -1047,6 +1048,20 @@ sub snapshot_rollback {
>  
>      $prepare = 0;
>      $class->lock_config($vmid, $updatefn);
> +
> +    my $replication_jobs = $repl_conf->list_guests_replication_jobs($vmid);
> +    for my $job (@{$replication_jobs}) {
> +        my $target = $job->{target};
> +        $logfunc->("replicating rolled back guest to node '$target'");
> +
> +        my $start_time = time();
> +        eval {
> +            PVE::Replication::run_replication($class, $job, $start_time, $start_time, $logfunc);
> +        };
> +        if (my $err = $@) {
> +            warn "unable to replicate rolled back guest to node '$target' - $err";
> +        }
> +    }
>  }
>  
>  # bash completion helper
> diff --git a/src/PVE/ReplicationConfig.pm b/src/PVE/ReplicationConfig.pm
> index fd856a0..84a718f 100644
> --- a/src/PVE/ReplicationConfig.pm
> +++ b/src/PVE/ReplicationConfig.pm
> @@ -228,6 +228,20 @@ sub find_local_replication_job {
>      return undef;
>  }
>  
> +sub list_guests_replication_jobs {
> +    my ($cfg, $vmid) = @_;
> +
> +    my $jobs = [];
> +
> +    for my $job (values %{$cfg->{ids}}) {
> +        next if $job->{type} ne 'local' || $job->{guest} != $vmid;
> +
> +        push @{$jobs}, $job;
> +    }
> +
> +    return $jobs;
> +}
> +
>  # makes old_target the new source for all local jobs of this guest
>  # makes new_target the target for the single local job with target old_target
>  sub switch_replication_job_target_nolock {
> -- 
> 2.30.2