public inbox for pve-devel@lists.proxmox.com
From: "Lukas Sichert" <l.sichert@proxmox.com>
To: "Thomas Lamprecht" <t.lamprecht@proxmox.com>,
	<pve-devel@lists.proxmox.com>
Subject: Re: [PATCH manager v2] fix #4130: external metric: better handle failed connections
Date: Thu, 07 May 2026 15:08:00 +0200
Message-ID: <DICGUWH26MZ1.14KA1HME4NHBN@proxmox.com>
In-Reply-To: <e50d34ae-4277-407c-8bd8-e9eb372c28ee@proxmox.com>

Thanks for looking into it. Comments are inline.

On 2026-05-06 19:00, Thomas Lamprecht <t.lamprecht@proxmox.com> wrote:

> On 19.02.26 at 13:34, Lukas Sichert wrote:
>> When an external metric server configured to use TCP is unreachable, the
>> storage and VM indicators of the cluster nodes in the UI turn gray. This
>> is because, currently, a failed connection attempt raises an unhandled
>> exception, which aborts the status update flow.  As the connection
>> attempts happen at the beginning of the update process, status
>> information is then not broadcast within the system or across the
>> cluster. After five minutes without updates, the frontend marks the
>> indicators as gray.
>> 
>> To catch connection errors, wrap connection establishment in an eval
>> block. The implementation ensures that other connections to external
>> metric servers are still established, even if one fails.
>> 
>> Signed-off-by: Lukas Sichert <l.sichert@proxmox.com>
>> ---
>> 
>> Notes:
>>     changes from v1 to v2:
>>     -add the SafeSyslog import required for syslog()
>>     -correct bug ID: #4911 -> #4130
>>     -move the push operation outside the eval block as suggested by Thomas
>>     
>>     Regarding catching the errors at a higher level: since this function
>>     is called once per plugin, not catching the error here would mean
>>     that not all plugins are checked.
>> 
>>  PVE/ExtMetric.pm | 9 +++++++--
>>  1 file changed, 7 insertions(+), 2 deletions(-)
>> 
>> diff --git a/PVE/ExtMetric.pm b/PVE/ExtMetric.pm
>> index ebc2817b..18815efd 100644
>> --- a/PVE/ExtMetric.pm
>> +++ b/PVE/ExtMetric.pm
>> @@ -7,6 +7,7 @@ use PVE::Status::Plugin;
>>  use PVE::Status::Graphite;
>>  use PVE::Status::InfluxDB;
>>  use PVE::Status::OpenTelemetry;
>> +use PVE::SafeSyslog;
>>  
>>  PVE::Status::Graphite->register();
>>  PVE::Status::InfluxDB->register();
>> @@ -52,8 +53,12 @@ sub transactions_start {
>>          $cfg,
>>          sub {
>>              my ($plugin, $id, $plugin_config) = @_;
>> -
>> -            my $connection = $plugin->_connect($plugin_config, $id);
>> +            
>> +            my $connection = eval { $plugin->_connect($plugin_config, $id);}; 
>
> there are various whitespace/code format issues here, please run the top-level
> "make tidy" target or call proxmox-perltidy manually on the files to fix this.

Thank you for pointing that out. I will fix it in v3.

>
>> +            if (my $err = $@) {
>> +                syslog( "warning", "connection for plugin '$id' failed: $err");
>> +                return;
>
> This now returns undef for the transactions, so the call sites in pvestatd
> probably need to be adapted too:
>
> if (defined(my $transactions = PVE::ExtMetric::transactions_start($status_cfg))) {
>     # do something with $transactions
> }
>
> As otherwise this could cause warnings about accessing an undef value.

The '$transactions' variable is initialized as an empty array earlier.
The 'return;' is inside the closure passed to 'foreach_plug', so it only
returns from that closure and prevents the 'push @$transactions, ...'
below from being executed for the current plugin. The 'foreach_plug'
loop then continues with the next plugin in '$cfg'. After 'foreach_plug'
finishes, '$transactions' is returned from 'transactions_start'.
Please tell me if I am misunderstanding some Perl closure intricacy
here.
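
To double-check that reasoning, here is a minimal standalone sketch. It is
not the actual PVE code: 'foreach_plug_sketch', 'connect_sketch' and
'transactions_start_sketch' are made-up stand-ins, and the 'reachable' flag
is only an assumption used to simulate a failing server. It should show that
'return;' leaves only the closure and that the array reference stays defined:

```perl
use strict;
use warnings;

# Hypothetical stand-in for foreach_plug: calls $code once per configured id.
sub foreach_plug_sketch {
    my ($cfg, $code) = @_;
    for my $id (sort keys %$cfg) {
        $code->($id, $cfg->{$id});
    }
}

# Hypothetical stand-in for $plugin->_connect: dies if the server is unreachable.
sub connect_sketch {
    my ($plugin_config, $id) = @_;
    die "connection refused\n" if !$plugin_config->{reachable};
    return "conn-$id";
}

sub transactions_start_sketch {
    my ($cfg) = @_;

    my $transactions = []; # always defined, even if every connection fails

    foreach_plug_sketch(
        $cfg,
        sub {
            my ($id, $plugin_config) = @_;

            my $connection = eval { connect_sketch($plugin_config, $id) };
            if (my $err = $@) {
                warn "connection for plugin '$id' failed: $err";
                return; # leaves only this closure, the loop continues
            }

            push @$transactions, { connection => $connection, id => $id };
        },
    );

    return $transactions;
}

my $cfg = {
    graphite => { reachable => 0 },
    influxdb => { reachable => 1 },
};
my $transactions = transactions_start_sketch($cfg);
print scalar(@$transactions), " transaction(s): ",
    join(", ", map { $_->{id} } @$transactions), "\n";
```

In this sketch the failing 'graphite' entry is visited first, and the
'influxdb' entry still ends up in the returned array, so the caller never
sees undef.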

>
>> +            }
>>  
>>              push @$transactions,
>>                  {
