From: "Lukas Sichert" <l.sichert@proxmox.com>
To: "Thomas Lamprecht" <t.lamprecht@proxmox.com>,
	<pve-devel@lists.proxmox.com>
Subject: Re: [PATCH manager v2] fix #4130: external metric: better handle failed connections
Date: Thu, 07 May 2026 15:08:00 +0200
Message-ID: <DICGUWH26MZ1.14KA1HME4NHBN@proxmox.com>
In-Reply-To: <e50d34ae-4277-407c-8bd8-e9eb372c28ee@proxmox.com>

Thanks for looking into it. Comments are inline.

On 2026-05-06 19:00, Thomas Lamprecht <t.lamprecht@proxmox.com> wrote:

> On 19.02.26 at 13:34, Lukas Sichert wrote:
>> When an external metric server configured to use TCP is unreachable, the
>> storage and VM indicators of the cluster nodes in the UI turn gray. This
>> is because, currently, a failed connection attempt raises an unhandled
>> exception, which aborts the status update flow. As the connection
>> attempts happen at the beginning of the update process, status
>> information is then not broadcast within the system or across the
>> cluster. After five minutes without updates, the frontend marks the
>> indicators as gray.
>> 
>> To catch connection errors, wrap connection establishment in an eval
>> block. The implementation ensures that other connections to external
>> metric servers are still established, even if one fails.
>> 
>> Signed-off-by: Lukas Sichert <l.sichert@proxmox.com>
>> ---
>> 
>> Notes:
>>     changes from v1 to v2:
>>     -add the SafeSyslog import required for syslog()
>>     -correct bug ID: #4911 -> #4130
>>     -move the push operation outside the eval block as suggested by Thomas
>>     
>>     Regarding catching the errors at a higher level: Since this function
>>     is called once per plugin during iteration, not catching the error
>>     here would mean that not all plugins are checked.
>> 
>>  PVE/ExtMetric.pm | 9 +++++++--
>>  1 file changed, 7 insertions(+), 2 deletions(-)
>> 
>> diff --git a/PVE/ExtMetric.pm b/PVE/ExtMetric.pm
>> index ebc2817b..18815efd 100644
>> --- a/PVE/ExtMetric.pm
>> +++ b/PVE/ExtMetric.pm
>> @@ -7,6 +7,7 @@ use PVE::Status::Plugin;
>>  use PVE::Status::Graphite;
>>  use PVE::Status::InfluxDB;
>>  use PVE::Status::OpenTelemetry;
>> +use PVE::SafeSyslog;
>>  
>>  PVE::Status::Graphite->register();
>>  PVE::Status::InfluxDB->register();
>> @@ -52,8 +53,12 @@ sub transactions_start {
>>          $cfg,
>>          sub {
>>              my ($plugin, $id, $plugin_config) = @_;
>> -
>> -            my $connection = $plugin->_connect($plugin_config, $id);
>> +            
>> +            my $connection = eval { $plugin->_connect($plugin_config, $id);}; 
>
> There are various whitespace/code format issues here, please run the top-level
> "make tidy" target or call proxmox-perltidy manually on the files to fix this.

Thank you for pointing that out. I will fix it in v3.

>
>> +            if (my $err = $@) {
>> +                syslog( "warning", "connection for plugin '$id' failed: $err");
>> +                return;
>
> This now returns an undef for transactions, so the call sites in pvestatd
> probably need to be adapted too to:
>
> if (defined(my $transactions = PVE::ExtMetric::transactions_start($status_cfg))) {
>     # do something with $transactions
> }
>
> As otherwise this could cause warnings about accessing an undef value.

The '$transactions' variable is initialized as a reference to an empty
array earlier. The 'return;' sits inside the closure passed to
'foreach_plug', so it only returns from that closure and skips the
'push @$transactions, ...' below for the current plugin. 'foreach_plug'
then continues with the next plugin in '$cfg'. Once it finishes,
'transactions_start' returns '$transactions', which is therefore always
defined, though possibly empty, so the callers in pvestatd should not
see an undef. Please tell me if I am misunderstanding some Perl closure
intricacy here.
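
To illustrate the control flow I mean, here is a minimal, self-contained
sketch. It is not the actual code from PVE/ExtMetric.pm: 'foreach_plug'
is a simplified stand-in that only loops over a hash, 'fake_connect' and
the 'reachable' flag are hypothetical, and 'warn' replaces 'syslog' so
that it runs without any PVE modules.

```perl
use strict;
use warnings;

# Simplified stand-in for the plugin iteration helper (assumption: the real
# one also calls the given code reference once per configured plugin).
sub foreach_plug {
    my ($cfg, $code) = @_;
    for my $id (sort keys %$cfg) {
        $code->($id, $cfg->{$id});
    }
}

# Hypothetical connect helper that dies for unreachable servers.
sub fake_connect {
    my ($id, $plugin_config) = @_;
    die "connection refused\n" if !$plugin_config->{reachable};
    return "conn-$id";
}

sub transactions_start {
    my ($cfg) = @_;
    my $transactions = [];    # always defined, possibly empty

    foreach_plug(
        $cfg,
        sub {
            my ($id, $plugin_config) = @_;

            my $connection = eval { fake_connect($id, $plugin_config) };
            if (my $err = $@) {
                warn "connection for plugin '$id' failed: $err";
                return;    # leaves only this closure call, the loop goes on
            }

            push @$transactions, { id => $id, connection => $connection };
        },
    );

    return $transactions;
}

my $cfg = {
    graphite => { reachable => 0 },
    influxdb => { reachable => 1 },
};

my $transactions = transactions_start($cfg);
printf "%d transaction(s): %s\n", scalar(@$transactions),
    join(', ', map { $_->{id} } @$transactions);
```

In this sketch the failed 'graphite' entry only produces a warning, the
'influxdb' entry is still pushed, and the caller gets back an array
reference in both the partial-failure and the all-failed case.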

>
>> +            }
>>  
>>              push @$transactions,
>>                  {





Thread overview: 3+ messages
2026-02-19 12:34 [PATCH manager v2] fix #4130: external metric: better handle failed connections Lukas Sichert
2026-05-06 17:00 ` Thomas Lamprecht
2026-05-07 13:08   ` Lukas Sichert [this message]
