From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <f.ebner@proxmox.com>
Received: from firstgate.proxmox.com (firstgate.proxmox.com [212.224.123.68])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
 key-exchange X25519 server-signature RSA-PSS (2048 bits))
 (No client certificate requested)
 by lists.proxmox.com (Postfix) with ESMTPS id 7E92462886
 for <pve-devel@lists.proxmox.com>; Tue, 22 Feb 2022 10:41:41 +0100 (CET)
Received: from firstgate.proxmox.com (localhost [127.0.0.1])
 by firstgate.proxmox.com (Proxmox) with ESMTP id 729EB2279F
 for <pve-devel@lists.proxmox.com>; Tue, 22 Feb 2022 10:41:41 +0100 (CET)
Received: from proxmox-new.maurer-it.com (proxmox-new.maurer-it.com
 [94.136.29.106])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
 key-exchange X25519 server-signature RSA-PSS (2048 bits))
 (No client certificate requested)
 by firstgate.proxmox.com (Proxmox) with ESMTPS id E03B822794
 for <pve-devel@lists.proxmox.com>; Tue, 22 Feb 2022 10:41:40 +0100 (CET)
Received: from proxmox-new.maurer-it.com (localhost.localdomain [127.0.0.1])
 by proxmox-new.maurer-it.com (Proxmox) with ESMTP id A164B46317
 for <pve-devel@lists.proxmox.com>; Tue, 22 Feb 2022 10:41:40 +0100 (CET)
Message-ID: <df346821-3dc3-d65e-f127-4c5e156424a7@proxmox.com>
Date: Tue, 22 Feb 2022 10:41:34 +0100
MIME-Version: 1.0
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101
 Thunderbird/91.6.0
Content-Language: en-US
To: pve-devel@lists.proxmox.com,
 =?UTF-8?Q?Fabian_Gr=c3=bcnbichler?= <f.gruenbichler@proxmox.com>,
 Thomas Lamprecht <t.lamprecht@proxmox.com>
References: <20220221115828.76012-1-f.ebner@proxmox.com>
 <20220221115828.76012-2-f.ebner@proxmox.com>
From: Fabian Ebner <f.ebner@proxmox.com>
In-Reply-To: <20220221115828.76012-2-f.ebner@proxmox.com>
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
X-SPAM-LEVEL: Spam detection results:  0
 AWL 0.131 Adjusted score from AWL reputation of From: address
 BAYES_00                 -1.9 Bayes spam probability is 0 to 1%
 KAM_DMARC_STATUS 0.01 Test Rule for DKIM or SPF Failure with Strict Alignment
 NICE_REPLY_A           -0.001 Looks like a legit reply (A)
 SPF_HELO_NONE           0.001 SPF: HELO does not publish an SPF Record
 SPF_PASS               -0.001 SPF: sender matches SPF record
 T_SCC_BODY_TEXT_LINE    -0.01 -
 URIBL_BLOCKED 0.001 ADMINISTRATOR NOTICE: The query to URIBL was blocked. See
 http://wiki.apache.org/spamassassin/DnsBlocklists#dnsbl-block for more
 information. [guesthelpers.pm]
Subject: Re: [pve-devel] [PATCH v3 guest-common 1/1] guest helpers: add
 run_with_replication_guard
X-BeenThere: pve-devel@lists.proxmox.com
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Proxmox VE development discussion <pve-devel.lists.proxmox.com>
List-Unsubscribe: <https://lists.proxmox.com/cgi-bin/mailman/options/pve-devel>, 
 <mailto:pve-devel-request@lists.proxmox.com?subject=unsubscribe>
List-Archive: <http://lists.proxmox.com/pipermail/pve-devel/>
List-Post: <mailto:pve-devel@lists.proxmox.com>
List-Help: <mailto:pve-devel-request@lists.proxmox.com?subject=help>
List-Subscribe: <https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel>, 
 <mailto:pve-devel-request@lists.proxmox.com?subject=subscribe>
X-List-Received-Date: Tue, 22 Feb 2022 09:41:41 -0000

Am 21.02.22 um 12:58 schrieb Fabian Ebner:
> Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
> ---
> 
> New in v3.
> 
>  src/PVE/GuestHelpers.pm | 15 ++++++++++++++-
>  1 file changed, 14 insertions(+), 1 deletion(-)
> 
> diff --git a/src/PVE/GuestHelpers.pm b/src/PVE/GuestHelpers.pm
> index 970c460..1183819 100644
> --- a/src/PVE/GuestHelpers.pm
> +++ b/src/PVE/GuestHelpers.pm
> @@ -3,8 +3,9 @@ package PVE::GuestHelpers;
>  use strict;
>  use warnings;
>  
> -use PVE::Tools;
> +use PVE::ReplicationConfig;
>  use PVE::Storage;
> +use PVE::Tools;
>  
>  use POSIX qw(strftime);
>  use Scalar::Util qw(weaken);
> @@ -82,6 +83,18 @@ sub guest_migration_lock {
>      return $res;
>  }
>  
> +sub run_with_replication_guard {
> +    my ($vmid, $timeout, $log, $func, @param) = @_;
> +
> +    my $repl_conf = PVE::ReplicationConfig->new();
> +    if ($repl_conf->check_for_existing_jobs($vmid, 1)) {
> +	$log->("checking/waiting for active replication..") if $log;
> +	guest_migration_lock($vmid, $timeout, $func, @param);

I wonder if we should unconditionally take the lock? If not, we can race
with a newly created replication job:
1. snapshot deletion starts
2. replication job is created
3. replication job starts
4. snapshot deletion runs into 'dataset is busy' error, because snapshot
is used by replication

IIRC Thomas didn't want the log line to be printed when there is no
replication configured, and we can still do that if we go for the
"unconditionally acquire lock" approach (it should get the lock quickly
except in the above edge case), but it would mean checking the
replication config just for that. My suggestion would be to get rid of
this helper, not log anything in the cases with 10s timeout, and log if
replication is configured in the backup case with 600s timeout.

> +    } else {
> +	$func->(@param);
> +    }
> +}
> +
>  sub check_hookscript {
>      my ($volid, $storecfg) = @_;
>