* [PVE-User] Hardware watchdog for standalone server...
@ 2024-08-19 13:39 Marco Gaiarin
2024-08-19 14:17 ` dORSY via pve-user
` (2 more replies)
0 siblings, 3 replies; 5+ messages in thread
From: Marco Gaiarin @ 2024-08-19 13:39 UTC (permalink / raw)
To: pve-user
The use of a watchdog device (preferably hardware; if not available,
softdog) is clear from:
https://pve.proxmox.com/wiki/High_Availability
and in particular:
https://pve.proxmox.com/wiki/High_Availability#ha_manager_fencing
where you can set the (hardware) device module in /etc/default/pve-ha-manager.
But for a standalone server (or a cluster with no HA enabled), does this still
apply? E.g., is /usr/sbin/watchdog-mux a valid, general-purpose watchdog service,
or is it better to use some other service instead (like the 'watchdog' daemon, or
the watchdog support built into systemd)?
Thanks.
--
_______________________________________________
pve-user mailing list
pve-user@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-user
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PVE-User] Hardware watchdog for standalone server...
2024-08-19 13:39 [PVE-User] Hardware watchdog for standalone server Marco Gaiarin
@ 2024-08-19 14:17 ` dORSY via pve-user
2024-08-19 14:47 ` Alwin Antreich via pve-user
[not found] ` <11e39814096a53c52422b9f2114719dd5ad73090@antreich.com>
2 siblings, 0 replies; 5+ messages in thread
From: dORSY via pve-user @ 2024-08-19 14:17 UTC (permalink / raw)
To: Proxmox VE user list; +Cc: dORSY
[-- Attachment #1: Type: message/rfc822, Size: 8373 bytes --]
From: dORSY <dorsyka@yahoo.com>
To: Proxmox VE user list <pve-user@lists.proxmox.com>
Subject: Re: [PVE-User] Hardware watchdog for standalone server...
Date: Mon, 19 Aug 2024 14:17:04 +0000 (UTC)
Message-ID: <1084462757.5000300.1724077024331@mail.yahoo.com>
Why wouldn't it? If a hardware watchdog times out (watchdogd does not reply in time), it restarts the server. That is common, even on standalone machines.
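As a rough illustration of the mechanism described above (not taken from the thread): on Linux, a watchdog daemon keeps the timer from expiring by writing to the watchdog device at regular intervals; if the writes stop, the hardware resets the machine. The sketch below assumes the standard Linux /dev/watchdog character device and its "magic close" convention (writing 'V' before closing asks the driver to disarm the timer, which a driver may refuse to honour). The function name feed_watchdog and its parameters are hypothetical, and the device is passed in as a file-like object so the logic can be tried without touching real hardware.

```python
import io
import time


def feed_watchdog(dev, interval_s, beats, sleep=time.sleep):
    """Write one keepalive byte to `dev` every `interval_s` seconds, `beats` times.

    `dev` is any writable binary file-like object; on a real host it would be
    the opened watchdog device (assumed here to be /dev/watchdog).
    """
    for _ in range(beats):
        dev.write(b"\0")   # any byte written counts as a keepalive
        dev.flush()        # push the write through to the driver immediately
        sleep(interval_s)
    # "Magic close": 'V' signals that the upcoming close is intentional, so the
    # timer is disarmed instead of firing (unless the driver ignores it).
    dev.write(b"V")
    dev.flush()


if __name__ == "__main__":
    # Dry run against an in-memory buffer; no hardware is touched.
    fake = io.BytesIO()
    feed_watchdog(fake, interval_s=0, beats=3)
    print(fake.getvalue())
```

On a real machine one would open the device unbuffered, for example open("/dev/watchdog", "wb", buffering=0), and pick an interval well below the device timeout. The device can normally be opened by only one process at a time, so on a Proxmox VE host where watchdog-mux may already hold it, a second daemon would likely fail to open it.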
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PVE-User] Hardware watchdog for standalone server...
2024-08-19 13:39 [PVE-User] Hardware watchdog for standalone server Marco Gaiarin
2024-08-19 14:17 ` dORSY via pve-user
@ 2024-08-19 14:47 ` Alwin Antreich via pve-user
[not found] ` <11e39814096a53c52422b9f2114719dd5ad73090@antreich.com>
2 siblings, 0 replies; 5+ messages in thread
From: Alwin Antreich via pve-user @ 2024-08-19 14:47 UTC (permalink / raw)
To: gaio; +Cc: Alwin Antreich, Proxmox VE user list
[-- Attachment #1: Type: message/rfc822, Size: 5093 bytes --]
From: "Alwin Antreich" <alwin@antreich.com>
To: gaio@lilliput.linux.it
Cc: "Proxmox VE user list" <pve-user@lists.proxmox.com>
Subject: Re: [PVE-User] Hardware watchdog for standalone server...
Date: Mon, 19 Aug 2024 14:47:30 +0000
Message-ID: <11e39814096a53c52422b9f2114719dd5ad73090@antreich.com>
Hallo Marco,
Do you want to make your VM/CT highly available? That only works reliably with three nodes for quorum (or two nodes plus a QDevice) [0].
But if you want to reset a node when it stops servicing the watchdog (i.e. it no longer responds), then you could use the systemd watchdog [1]. I'd add, though, that resetting a node over and over (when the issue keeps recurring) increases the chance of data corruption.
Cheers,
Alwin
[0] https://pve.proxmox.com/wiki/High_Availability#_requirements
[1] https://0pointer.de/blog/projects/watchdog.html
^ permalink raw reply [flat|nested] 5+ messages in thread
[parent not found: <11e39814096a53c52422b9f2114719dd5ad73090@antreich.com>]
* Re: [PVE-User] Hardware watchdog for standalone server...
[not found] ` <11e39814096a53c52422b9f2114719dd5ad73090@antreich.com>
@ 2024-08-19 15:42 ` Marco Gaiarin
2024-08-19 19:10 ` dORSY via pve-user
0 siblings, 1 reply; 5+ messages in thread
From: Marco Gaiarin @ 2024-08-19 15:42 UTC (permalink / raw)
To: pve-user
Hello, Alwin Antreich!
On that day it was said...
> Do you want to make your VM/CT HA? That only works reliably with 3x nodes for
> quorum (or 2x nodes & qdevice) [0].
No, and I know that. I'm talking about standalone servers.
> But if you want to reset a node if it doesn't spin up a watchdog (doesn't
> respond), then you could use systemd watchdog[1].
That is EXACTLY my question (this goes for dORSY too): do I need some separate
watchdog daemon, or does the HA watchdog daemon 'watchdog-mux' also work in a
standalone setup?
> Though I'd like to add that
> resetting nodes continuously (issue is reoccurring), increases the chances for
> data corruption.
If some hardware or other trouble triggers the watchdog, the box is typically
already TOFU and a reset cannot do more harm... at least that is my
experience.
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PVE-User] Hardware watchdog for standalone server...
2024-08-19 15:42 ` Marco Gaiarin
@ 2024-08-19 19:10 ` dORSY via pve-user
0 siblings, 0 replies; 5+ messages in thread
From: dORSY via pve-user @ 2024-08-19 19:10 UTC (permalink / raw)
To: Proxmox VE user list, Marco Gaiarin; +Cc: dORSY
[-- Attachment #1: Type: message/rfc822, Size: 9329 bytes --]
From: dORSY <dorsyka@yahoo.com>
To: Proxmox VE user list <pve-user@lists.proxmox.com>, Marco Gaiarin <gaio@lilliput.linux.it>
Subject: Re: [PVE-User] Hardware watchdog for standalone server...
Date: Mon, 19 Aug 2024 19:10:48 +0000 (UTC)
Message-ID: <364231406.5096226.1724094648740@mail.yahoo.com>
I don't really get what you want/need exactly, but for single hosts I'd start with:
bmc-watchdog(8) - Linux man page
bmc-watchdog controls a Baseboard Management Controller (BMC) watchdog timer. The bmc-watchdog tool typically executes as a cronjob or daemon to manage the ...
It needs hardware support (IPMI) to work. I last used a daemon like this on FreeBSD storage servers. Hope this helps.
^ permalink raw reply [flat|nested] 5+ messages in thread