From: "Fabian Grünbichler" <f.gruenbichler@proxmox.com>
To: Fabian Ebner <f.ebner@proxmox.com>, pve-devel@lists.proxmox.com
Subject: Re: [pve-devel] [PATCH manager 1/3] Hold the guest migration lock when changing the replication config
Date: Mon, 03 Aug 2020 09:49:41 +0200 [thread overview]
Message-ID: <1596440280.7cx8h2dp61.astroid@nora.none> (raw)
In-Reply-To: <20200730112915.15402-1-f.ebner@proxmox.com>
On July 30, 2020 1:29 pm, Fabian Ebner wrote:
> The guest migration lock is already held when running replications,
> but it also makes sense to hold it when updating the replication
> config itself. Otherwise, it can happen that the migration does
> not know the de-facto state of replication.
>
> For example:
> 1. migration starts
> 2. replication job is deleted
> 3. migration reads the replication config
> 4. migration runs the replication which causes the
> replicated disks to be removed, because the job
> is marked for removal
> 5. migration will continue without replication
>
> Note that the migration doesn't actually fail, but it's probably
> not the desired behavior either.
>
> Suggested-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
> Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
> ---
> PVE/API2/ReplicationConfig.pm | 18 ++++++++++++++----
> 1 file changed, 14 insertions(+), 4 deletions(-)
>
> diff --git a/PVE/API2/ReplicationConfig.pm b/PVE/API2/ReplicationConfig.pm
> index 2b4ecd10..e5262068 100644
> --- a/PVE/API2/ReplicationConfig.pm
> +++ b/PVE/API2/ReplicationConfig.pm
> @@ -9,6 +9,7 @@ use PVE::JSONSchema qw(get_standard_option);
> use PVE::RPCEnvironment;
> use PVE::ReplicationConfig;
> use PVE::Cluster;
> +use PVE::GuestHelpers;
>
> use PVE::RESTHandler;
>
> @@ -144,7 +145,9 @@ __PACKAGE__->register_method ({
> $cfg->write();
> };
>
> - PVE::ReplicationConfig::lock($code);
> + PVE::GuestHelpers::guest_migration_lock($guest, 10, sub {
> + PVE::ReplicationConfig::lock($code);
> + });
it might make sense to have a single wrapper for this, or to add the
guest ID as a parameter to ReplicationConfig::lock (so callers can't
miss it or get the lock order wrong).
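a rough sketch of such a wrapper (the name `lock_with_guest` and its
placement are hypothetical, just to illustrate the idea, not an
existing API):

```perl
# hypothetical helper, e.g. in PVE::ReplicationConfig, that fixes the
# lock order once: per-guest migration lock first, then the
# replication config lock
sub lock_with_guest {
    my ($guest_id, $code) = @_;

    PVE::GuestHelpers::guest_migration_lock($guest_id, 10, sub {
	PVE::ReplicationConfig::lock($code);
    });
}
```

the API endpoints above could then call it as
`PVE::ReplicationConfig::lock_with_guest($guest_id, $code);` without
repeating the nesting at every call site.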
what about the calls to lock within ReplicationConfig? they are all
job/guest ID specific, and should also get this additional protection,
right?
from a quick glance, there seems to be only a single call to
ReplicationConfig::lock that spans more than one job (job_status in
ReplicationState), but that one immediately iterates over the jobs, so
we could either move the lock into the loop (expensive, since each
iteration would then involve a cfs_lock), or split the cfs lock and
the flock just for this one call site?
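the first variant could look roughly like this (a sketch only, assuming
the surrounding whole-config lock is replaced by one per-guest lock per
iteration; the loop body stands in for whatever job_status does per job):

```perl
# sketch: take the per-guest migration lock per iteration instead of
# one config-wide lock around the whole loop
for my $jobid (sort keys %{$cfg->{ids}}) {
    my ($guest_id) = PVE::ReplicationConfig::parse_replication_job_id($jobid);

    PVE::GuestHelpers::guest_migration_lock($guest_id, 10, sub {
	# ... update the state for this one job ...
    });
}
```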
(side note: that code, and possibly other code in ReplicationConfig, is
buggy since it does not re-read the config after taking the lock)
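i.e., the callback should instantiate the config only once it holds the
lock, so it can't operate on a stale view written by a concurrent
writer - roughly:

```perl
PVE::ReplicationConfig::lock(sub {
    # re-read the config *inside* the locked section
    my $cfg = PVE::ReplicationConfig->new();

    # ... modify $cfg ...

    $cfg->write();
});
```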
>
> return undef;
> }});
> @@ -167,6 +170,7 @@ __PACKAGE__->register_method ({
> my $id = extract_param($param, 'id');
> my $digest = extract_param($param, 'digest');
> my $delete = extract_param($param, 'delete');
> + my ($guest_id) = PVE::ReplicationConfig::parse_replication_job_id($id);
>
> my $code = sub {
> my $cfg = PVE::ReplicationConfig->new();
> @@ -199,7 +203,9 @@ __PACKAGE__->register_method ({
> $cfg->write();
> };
>
> - PVE::ReplicationConfig::lock($code);
> + PVE::GuestHelpers::guest_migration_lock($guest_id, 10, sub {
> + PVE::ReplicationConfig::lock($code);
> + });
>
> return undef;
> }});
> @@ -237,10 +243,12 @@ __PACKAGE__->register_method ({
>
> my $rpcenv = PVE::RPCEnvironment::get();
>
> + my $id = extract_param($param, 'id');
> + my ($guest_id) = PVE::ReplicationConfig::parse_replication_job_id($id);
> +
> my $code = sub {
> my $cfg = PVE::ReplicationConfig->new();
>
> - my $id = $param->{id};
> if ($param->{force}) {
> raise_param_exc({ 'keep' => "conflicts with parameter 'force'" }) if $param->{keep};
> delete $cfg->{ids}->{$id};
> @@ -262,7 +270,9 @@ __PACKAGE__->register_method ({
> $cfg->write();
> };
>
> - PVE::ReplicationConfig::lock($code);
> + PVE::GuestHelpers::guest_migration_lock($guest_id, 10, sub {
> + PVE::ReplicationConfig::lock($code);
> + });
>
> return undef;
> }});
> --
> 2.20.1
>
>
Thread overview: 5 messages
2020-07-30 11:29 [pve-devel] [PATCH manager 1/3] Hold the guest migration lock when changing the replication config Fabian Ebner
2020-07-30 11:29 ` [pve-devel] [PATCH qemu-server 2/3] Repeat check for replication target in locked section Fabian Ebner
2020-07-30 11:29 ` [pve-devel] [PATCH/RFC qemu-server 3/3] Fix checks for transfering replication state/switching job target Fabian Ebner
2020-08-03  7:11 ` [pve-devel] [PATCH manager 1/3] Hold the guest migration lock when changing the replication config Fabian Ebner
2020-08-03  7:49   ` Fabian Grünbichler [this message]