From: Fiona Ebner <f.ebner@proxmox.com>
To: Thomas Lamprecht <t.lamprecht@proxmox.com>,
Proxmox VE development discussion <pve-devel@lists.proxmox.com>
Subject: Re: [pve-devel] [PATCH manager] services: add restart on-failure to pvescheduler, pvestatd and spiceproxy
Date: Mon, 26 May 2025 15:37:17 +0200
Message-ID: <2a2f7415-8156-493a-9468-403a8177cb3d@proxmox.com>
In-Reply-To: <1561ec4d-9935-4c70-860b-3da84f70b8c0@proxmox.com>
On 26.05.25 at 12:38, Thomas Lamprecht wrote:
> On 26.05.25 at 10:45, Fiona Ebner wrote:
>> Same rationale as 4fd2027e ("service: add restart on-failure to
>> pveproxy and pvedaemon") which added the setting for the pveproxy and
>> pvedaemon services.
>>
>> Suggested for pvestatd in the community forum:
>> https://forum.proxmox.com/threads/165597/post-773210
>
> Fine by me in general, but it might be good to recheck whether the overall
> behavior of the mechanism makes sense, especially with the default
> RestartSec=100ms (man systemd.service) and the default StartLimitBurst=5
> (man systemd.unit), which basically means that if the problematic condition
> is still present, the service will be restarted 5 times in a total span of
> 500 ms, and then not get restarted anymore. The StartLimitIntervalSec=10s
> default is also a limiting factor, but when the service fails fast early,
> it's unlikely to be hit.
>
> Maybe increasing the interval between restarts a bit (0.5 to 1s?) and/or
> the burst rate (10 to 20 times) would make sense, to survive more temporary
> issues – there certainly isn't a one-size-fits-all here, but 5 times in
> 500 ms is IMO not that ideal for our services here.
>
> That said, applying this now should not make the status quo worse, besides
> filling the logs with restart failures, which makes the limited output
> included in the systemctl status commands less useful, but that's hardly a
> real problem.
Yes, this can be fine-tuned further.

Should there be a limit? AFAIU, if we pick e.g. StartLimitBurst=11,
RestartSec=1, then with the default StartLimitIntervalSec=10s, the limit
will never be hit and systemd would keep trying to restart the service
perpetually. Do we want to keep a limit by also increasing the
StartLimitIntervalSec value accordingly?

I suppose this should be adapted for pvedaemon and pveproxy too then?
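
To sanity-check the arithmetic above, here is a rough Python sketch of the
start rate limit. It is a simplified model and not systemd's actual code: it
assumes the service fails immediately after every start, that the next start
happens exactly RestartSec later, and that the limiter counts starts in a
window that begins at the first start and resets once StartLimitIntervalSec
has passed. The function name and the 30s interval in the last call are made
up for illustration.

```python
def time_of_refused_start(restart_ms, burst, interval_ms, horizon_ms=600_000):
    """Return the time (ms) at which a start is refused by the rate limit,
    or None if no start is refused within the simulated horizon.

    Assumed model: the service fails right after each start, and the limiter
    allows at most `burst` starts per window of `interval_ms`.
    """
    window_begin = None  # time of the first start in the current window
    count = 0            # starts counted in the current window
    now = 0
    while now <= horizon_ms:
        if window_begin is None or now - window_begin > interval_ms:
            # Window expired (or first start ever): open a new window.
            window_begin = now
            count = 1
        elif count < burst:
            count += 1
        else:
            return now   # this start attempt would exceed the burst limit
        now += restart_ms  # immediate failure, next attempt after RestartSec
    return None


# Defaults: RestartSec=100ms, StartLimitBurst=5, StartLimitIntervalSec=10s
print(time_of_refused_start(100, 5, 10_000))
# StartLimitBurst=11, RestartSec=1, default StartLimitIntervalSec=10s
print(time_of_refused_start(1000, 11, 10_000))
# Same, but with a hypothetical StartLimitIntervalSec=30s
print(time_of_refused_start(1000, 11, 30_000))
```

Under these assumptions, the defaults give up after 5 starts within 500 ms,
the 11/1s/10s combination never reaches the limit, and raising the interval
makes the limit effective again. Real start-up and failure times are not
zero, so the exact numbers on a live system will differ.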