From: Thomas Lamprecht <t.lamprecht@proxmox.com>
To: Dominik Csapak <d.csapak@proxmox.com>, pdm-devel@lists.proxmox.com
Subject: Re: [RFC PATCH datacenter-manager 1/4] server: connection: multi client: use correct client error for retrying
Date: Sat, 30 May 2026 01:25:22 +0200 [thread overview]
Message-ID: <ebea3bce-d6e9-418d-bd51-cd9832be084d@proxmox.com> (raw)
In-Reply-To: <20260529133026.3149896-2-d.csapak@proxmox.com>
Am 29.05.26 um 15:30 schrieb Dominik Csapak:
> Since commit
> 8e3b1f06 (client: split out Connect error and factor out error classification)
> in the proxmox workspace, Connection errors are split out to it's own
> type of `proxmox_client::Error`s. Since then, every connection error
> was returned normally and not retried via other hosts from the remote
> config.
>
> Use the correct error type to only retry when the actual connection is
> not working, not if the api or the server returns an issue.
>
> Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
> ---
> Not sure if just changing this is a good idea, or if we want to catch
> both connect and client errors? tracking only connect errors seemed
> the correct way to me here, but maybe someone has a different opinion.
This is not exactly how it was intended and also not fully correct, but
many thanks for reminding me about having to pick this up proper.
This was triggered originally by the occasionally noticed multi client
error pop up I saw on long living tab with a PDM open, e.g. after having
been idle for a while and then noticing that it basically always affected a
remote with a single node (but did not a thorough statistical probe here).
When checking the multi client's retry logic with that in mind I noticed that
any stale connection or blip for a remote with just one node made the retry
logic useless. And when wanting to fix that I also stumbled into the client
here and the error types.
Anyway, pushed now three commits that should fix that proper, one for the
client [0] as there was still an issue left, and then two commits in PDM that
fix the error usage [1] (replacing your patch here) and implement a final
fallback to the first node [2], that short-circuits to the sole node for
single-node remotes immediately and for clusters it retries the first, in
the (rare but not theoretic) case all other tracked nodes are genuinely down
and the first one just had a network blip it will help and has no real
downside (given my implementation is correct naturally ;P).
[0]: https://git.proxmox.com/?p=proxmox.git;a=commitdiff;h=52cc43f90b95d650286a568ed737264631f333e9
[1]: https://git.proxmox.com/?p=proxmox-datacenter-manager.git;a=commitdiff;h=2807bc729e131ea2cbce247bfa39c2e3db3f7fa6
[2]: https://git.proxmox.com/?p=proxmox-datacenter-manager.git;a=commitdiff;h=59135da87a5a0df2b0d3a0c82e8f282fb9ef0657
next prev parent reply other threads:[~2026-05-29 23:25 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-05-29 13:30 [PATCH datacenter-manager 0/4] implement back-off mechanism for Dominik Csapak
2026-05-29 13:30 ` [RFC PATCH datacenter-manager 1/4] server: connection: multi client: use correct client error for retrying Dominik Csapak
2026-05-29 23:25 ` Thomas Lamprecht [this message]
2026-05-29 13:30 ` [PATCH datacenter-manager 2/4] server: remote cache: prepare for back-off mechanism Dominik Csapak
2026-05-29 23:40 ` Thomas Lamprecht
2026-05-29 13:30 ` [PATCH datacenter-manager 3/4] server: connection: multi-client: use back-off state from remote cache Dominik Csapak
2026-05-29 13:30 ` [PATCH datacenter-manager 4/4] server: pbs client: rework to use the back-off mechanism " Dominik Csapak
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ebea3bce-d6e9-418d-bd51-cd9832be084d@proxmox.com \
--to=t.lamprecht@proxmox.com \
--cc=d.csapak@proxmox.com \
--cc=pdm-devel@lists.proxmox.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox