From: Mira Limbeck <m.limbeck@proxmox.com>
To: pve-devel@lists.proxmox.com
Subject: Re: [pve-devel] [PATCH storage] iscsi: disable Open-iSCSI login retries to avoid blocking pvestatd
Date: Fri, 11 Oct 2024 13:20:06 +0200
Message-ID: <b896d78b-a8ee-4de4-b682-ce67098a9dc0@proxmox.com>
In-Reply-To: <20241011093744.25545-1-f.weber@proxmox.com>

On 10/11/24 11:37, Friedrich Weber wrote:
> Since 90c1b10 ("fix #254: iscsi: add support for multipath targets"),
> iSCSI storage activation checks whether a session exists for each
> discovered portal. If there is a discovered portal without a session,
> it performs a discovery and login in the hope of establishing a
> session to the portal. If the portal is unreachable when trying to log
> in, Open-iSCSI's default behavior is to retry for up to 2 minutes, as
> explained in /etc/iscsi/iscsid.conf:
> 
>> # The default node.session.initial_login_retry_max is 8 and
>> # node.conn[0].timeo.login_timeout is 15 so we have:
>> #
>> # node.conn[0].timeo.login_timeout * \
>> node.session.initial_login_retry_max = 120s
> 
> If pvestatd is activating the storage, it will be blocked during that
> time, which is undesirable. This is particularly unfortunate if the
> target announces portals that the host permanently cannot reach. In
> that case, every pvestatd iteration will take 2 minutes. While it can
> be argued that such setups are misconfigured, it is still desirable to
> keep the fallout of that misconfiguration as low as possible.
> 
> In order to reduce the time Open-iSCSI tries to log in, instruct
> Open-iSCSI to not perform login retries for that target. For this, set
> node.session.initial_login_retry_max for the target to 0. This setting
> is stored in Open-iSCSI's records under /etc/iscsi/nodes. As these
> records are overwritten with the defaults from /etc/iscsi/iscsid.conf
> on discovery, the setting needs to be applied after discovery.
> 
> With this setting, one login attempt should take at most 15 seconds.
> This is still higher than pvestatd's iteration time of 10 seconds, but
> more tolerable. Logins will still be continuously retried by pvestatd
> in every iteration until there is a session to each discovered portal.
> 
> Signed-off-by: Friedrich Weber <f.weber@proxmox.com>
> ---
>  src/PVE/Storage/ISCSIPlugin.pm | 8 ++++++++
>  1 file changed, 8 insertions(+)
> 
> diff --git a/src/PVE/Storage/ISCSIPlugin.pm b/src/PVE/Storage/ISCSIPlugin.pm
> index 2bdd9a2..efd9de4 100644
> --- a/src/PVE/Storage/ISCSIPlugin.pm
> +++ b/src/PVE/Storage/ISCSIPlugin.pm
> @@ -132,6 +132,14 @@ sub iscsi_login {
>      eval { iscsi_discovery($portals); };
>      warn $@ if $@;
>  
> +    # Disable retries to avoid blocking pvestatd for too long, next iteration will retry anyway
> +    eval {
> +	my $cmd = [$ISCSIADM, '--mode', 'node', '--targetname', $target, '--op', 'update',
> +	    '--name', 'node.session.initial_login_retry_max', '--value', '0'];
As briefly discussed off-list, this should probably follow a style
similar to the `Wrapping Arguments` section of the Perl Style Guide, but
with each option and its value grouped together on the same line?
https://pve.proxmox.com/wiki/Perl_Style_Guide#Wrapping_Arguments
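
For illustration, a rough sketch of the layout I have in mind, with one
option/value pair per line. The values assigned to `$ISCSIADM` and
`$target` are placeholders I made up so the snippet runs standalone; in
the plugin they come from the surrounding code, and the list would be
passed to run_command() rather than printed.

```perl
use strict;
use warnings;

# Placeholder values (assumptions for this sketch only); in ISCSIPlugin.pm
# these are provided by the module and by the arguments of iscsi_login().
my $ISCSIADM = '/usr/bin/iscsiadm';
my $target = 'iqn.2024-10.com.example:target0';

# Each option shares a line with its value.
my $cmd = [
    $ISCSIADM,
    '--mode', 'node',
    '--targetname', $target,
    '--op', 'update',
    '--name', 'node.session.initial_login_retry_max',
    '--value', '0',
];

# Print instead of executing, so the sketch has no side effects.
print join(' ', @$cmd), "\n";
```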

> +	run_command($cmd);
> +    };
> +    warn $@ if $@;
> +
>      run_command([$ISCSIADM, '--mode', 'node', '--targetname',  $target, '--login']);
>  }
>  

Tested this with 4 portals by disconnecting 2. With this patch the
pvestatd update time was at ~30 seconds, matching 2 failed logins.
Without the patch it was ~484 seconds.
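
As a rough cross-check of these numbers, here is a small sketch of the
nominal worst case derived from the iscsid.conf comment quoted above
(login_timeout * number of attempts). The helper name is made up, and the
sketch assumes that the logins to the unreachable portals happen one after
another and that a retry limit of 0 means a single attempt.

```perl
use strict;
use warnings;

# Nominal worst-case time (in seconds) spent on logins to unreachable
# portals. Hypothetical helper, not part of the plugin.
sub nominal_login_time {
    my ($login_timeout, $attempts, $unreachable_portals) = @_;
    return $login_timeout * $attempts * $unreachable_portals;
}

my $login_timeout = 15; # default node.conn[0].timeo.login_timeout
my $unreachable = 2;    # portals disconnected in the test

# Defaults: 8 attempts per portal according to the iscsid.conf comment.
my $default = nominal_login_time($login_timeout, 8, $unreachable);
# With the patch: assumed to be a single attempt per portal.
my $patched = nominal_login_time($login_timeout, 1, $unreachable);

printf "default: %ds\n", $default; # 240
printf "patched: %ds\n", $patched; # 30
```

The patched value lines up with the ~30 seconds observed. The measured
~484 seconds without the patch is above the nominal 240 seconds, for the
reason given below.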

Since a login fails after 7 seconds, the old behavior actually performed
more than 8 retries; see the comment for
`node.session.initial_login_retry_max` in iscsid.conf:
# Note that if the login fails
# quickly (before node.conn[0].timeo.login_timeout fires) because the network
# layer or the target returns an error, iscsid may retry the login more than
# node.session.initial_login_retry_max times.

So especially for cases like `no route to host`, this should
significantly improve the update time when a target has multiple
portals, some of which are never reachable.


Consider this patch:
Tested-by: Mira Limbeck <m.limbeck@proxmox.com>
Reviewed-by: Mira Limbeck <m.limbeck@proxmox.com>



