public inbox for pve-user@lists.proxmox.com
* [PVE-User] Hardware watchdog for standalone server...
@ 2024-08-19 13:39 Marco Gaiarin
  2024-08-19 14:17 ` dORSY via pve-user
                   ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: Marco Gaiarin @ 2024-08-19 13:39 UTC (permalink / raw)
  To: pve-user


The use of a watchdog device (preferably hardware; if not available,
softdog) is clear from:

	https://pve.proxmox.com/wiki/High_Availability

and in particular:

	https://pve.proxmox.com/wiki/High_Availability#ha_manager_fencing

where you can set the (hardware) device module in /etc/default/pve-ha-manager.


But does this still apply to a standalone server (or a cluster with no HA
enabled)? E.g., is /usr/sbin/watchdog-mux a valid, general-purpose watchdog
service, or is it better to use some other service instead (like the
'watchdog' daemon, or the watchdog support built into systemd)?


Thanks.

-- 



_______________________________________________
pve-user mailing list
pve-user@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-user



* Re: [PVE-User] Hardware watchdog for standalone server...
  2024-08-19 13:39 [PVE-User] Hardware watchdog for standalone server Marco Gaiarin
@ 2024-08-19 14:17 ` dORSY via pve-user
  2024-08-19 14:47 ` Alwin Antreich via pve-user
       [not found] ` <11e39814096a53c52422b9f2114719dd5ad73090@antreich.com>
  2 siblings, 0 replies; 5+ messages in thread
From: dORSY via pve-user @ 2024-08-19 14:17 UTC (permalink / raw)
  To: Proxmox VE user list; +Cc: dORSY

[-- Attachment #1: Type: message/rfc822, Size: 8373 bytes --]

From: dORSY <dorsyka@yahoo.com>
To: Proxmox VE user list <pve-user@lists.proxmox.com>
Subject: Re: [PVE-User] Hardware watchdog for standalone server...
Date: Mon, 19 Aug 2024 14:17:04 +0000 (UTC)
Message-ID: <1084462757.5000300.1724077024331@mail.yahoo.com>

Why wouldn't it? If a hardware watchdog times out (watchdogd does not reply in time), it restarts the server. That is a common setup, even on standalone machines.
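
To illustrate what "replying in time" means at the device level, here is a minimal Python sketch of the keep-alive protocol that the Linux kernel documents for /dev/watchdog: opening the device arms the timer, any write resets the countdown, and writing the character 'V' just before closing asks the driver to disarm (the "magic close", which is only honoured if the driver supports it and nowayout is not set). The function names, the beat count and the interval are illustrative assumptions, not something from this thread. The device can typically be held open by only one process at a time, so this is not meant to run next to another watchdog daemon.

```python
import time
from typing import BinaryIO, Callable

# Assumed default device node; some systems expose /dev/watchdog0 etc. instead.
WATCHDOG_DEVICE = "/dev/watchdog"


def pet_watchdog(
    dev: BinaryIO,
    beats: int,
    interval: float,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Write one keep-alive byte, then wait `interval` seconds, `beats` times."""
    for _ in range(beats):
        dev.write(b"\0")  # any write resets the hardware countdown
        dev.flush()
        sleep(interval)


def disarm_watchdog(dev: BinaryIO) -> None:
    """Send the magic-close character so that closing the device stops the timer."""
    dev.write(b"V")
    dev.flush()
```

On a real host these would be called on a file object obtained with open(WATCHDOG_DEVICE, "wb", buffering=0), with an interval well below the hardware timeout. Be aware that merely opening the device starts the countdown, and that the machine resets if the process dies without the magic close.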


* Re: [PVE-User] Hardware watchdog for standalone server...
  2024-08-19 13:39 [PVE-User] Hardware watchdog for standalone server Marco Gaiarin
  2024-08-19 14:17 ` dORSY via pve-user
@ 2024-08-19 14:47 ` Alwin Antreich via pve-user
       [not found] ` <11e39814096a53c52422b9f2114719dd5ad73090@antreich.com>
  2 siblings, 0 replies; 5+ messages in thread
From: Alwin Antreich via pve-user @ 2024-08-19 14:47 UTC (permalink / raw)
  To: gaio; +Cc: Alwin Antreich, Proxmox VE user list

[-- Attachment #1: Type: message/rfc822, Size: 5093 bytes --]

From: "Alwin Antreich" <alwin@antreich.com>
To: gaio@lilliput.linux.it
Cc: "Proxmox VE user list" <pve-user@lists.proxmox.com>
Subject: Re: [PVE-User] Hardware watchdog for standalone server...
Date: Mon, 19 Aug 2024 14:47:30 +0000
Message-ID: <11e39814096a53c52422b9f2114719dd5ad73090@antreich.com>

Hello Marco,



August 19, 2024 at 3:39 PM, "Marco Gaiarin" <gaio@lilliput.linux.it> wrote:



> 
> The use of a watchdog device (preferably hardware; if not available,
> softdog) is clear from:
> 
>  https://pve.proxmox.com/wiki/High_Availability
> 
> and in particular:
> 
>  https://pve.proxmox.com/wiki/High_Availability#ha_manager_fencing
> 
> where you can set the (hardware) device module in /etc/default/pve-ha-manager.
> 
> But does this still apply to a standalone server (or a cluster with no HA
> enabled)? E.g., is /usr/sbin/watchdog-mux a valid, general-purpose watchdog
> service, or is it better to use some other service instead (like the
> 'watchdog' daemon, or the watchdog support built into systemd)?
> 
Do you want to make your VMs/CTs highly available? That only works reliably with three nodes for quorum (or two nodes and a QDevice) [0].

But if you want to reset a node when it stops responding (that is, when it no longer services the watchdog), you could use the systemd watchdog [1]. I'd like to add, though, that resetting a node repeatedly (because the issue keeps recurring) increases the chance of data corruption.

Cheers,
Alwin

[0] https://pve.proxmox.com/wiki/High_Availability#_requirements
[1] https://0pointer.de/blog/projects/watchdog.html
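
The post at [1] covers two layers: systemd itself servicing the hardware watchdog (RuntimeWatchdogSec= in /etc/systemd/system.conf), and per-service watchdogs (WatchdogSec= in a unit file), where the service has to send regular keep-alive messages. As a rough sketch of the second layer, this is what such a keep-alive could look like in Python, using only the notification socket protocol (a datagram containing WATCHDOG=1 sent to the address in $NOTIFY_SOCKET). The helper names are made up for this example.

```python
import os
import socket
from typing import Optional


def sd_notify(message: str) -> bool:
    """Send one datagram to systemd; return False when not run under systemd."""
    path = os.environ.get("NOTIFY_SOCKET")
    if not path:
        return False
    if path.startswith("@"):
        # A leading '@' denotes a Linux abstract socket (leading NUL byte).
        path = "\0" + path[1:]
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as sock:
        sock.sendto(message.encode(), path)
    return True


def watchdog_interval() -> Optional[float]:
    """Half of WATCHDOG_USEC in seconds, or None if no service watchdog is set."""
    usec = os.environ.get("WATCHDOG_USEC")
    return int(usec) / 2_000_000 if usec else None
```

A service's main loop would call sd_notify("WATCHDOG=1") roughly every watchdog_interval() seconds; pinging at half the configured timeout leaves some margin before systemd considers the service hung.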


* Re: [PVE-User] Hardware watchdog for standalone server...
       [not found] ` <11e39814096a53c52422b9f2114719dd5ad73090@antreich.com>
@ 2024-08-19 15:42   ` Marco Gaiarin
  2024-08-19 19:10     ` dORSY via pve-user
  0 siblings, 1 reply; 5+ messages in thread
From: Marco Gaiarin @ 2024-08-19 15:42 UTC (permalink / raw)
  To: pve-user

Hello, Alwin Antreich!
  On that day it was said...

> Do you want to make your VM/CT HA? That only works reliably with 3x nodes for
> quorum (or 2x nodes & qdevice) [0].

I know. No. I'm speaking about standalone servers.


> But if you want to reset a node if it doesn't spin up a watchdog (doesn't
> respond), then you could use systemd watchdog[1].

That is EXACTLY what I'm asking (this goes for dORSY too): do I need some
separate watchdog daemon, or does the HA watchdog daemon 'watchdog-mux' also
work in a standalone setup?


> Though I'd like to add that
> resetting nodes continuously (issue is reoccurring), increases the chances for
> data corruption.

If some hardware or other problem triggers the watchdog, the box is
typically already TOFU and a reset cannot do any more harm... at least in my
experience.

-- 



* Re: [PVE-User] Hardware watchdog for standalone server...
  2024-08-19 15:42   ` Marco Gaiarin
@ 2024-08-19 19:10     ` dORSY via pve-user
  0 siblings, 0 replies; 5+ messages in thread
From: dORSY via pve-user @ 2024-08-19 19:10 UTC (permalink / raw)
  To: Proxmox VE user list, Marco Gaiarin; +Cc: dORSY

[-- Attachment #1: Type: message/rfc822, Size: 9329 bytes --]

From: dORSY <dorsyka@yahoo.com>
To: Proxmox VE user list <pve-user@lists.proxmox.com>,  Marco Gaiarin <gaio@lilliput.linux.it>
Subject: Re: [PVE-User] Hardware watchdog for standalone server...
Date: Mon, 19 Aug 2024 19:10:48 +0000 (UTC)
Message-ID: <364231406.5096226.1724094648740@mail.yahoo.com>

I don't really get what exactly you want or need, but for single hosts I'd start with bmc-watchdog(8):

  bmc-watchdog(8) - Linux man page
  bmc-watchdog controls a Baseboard Management Controller (BMC) watchdog timer. The bmc-watchdog tool typically executes as a cronjob or daemon to manage the ...

It needs hardware support (IPMI) to work. I last used a daemon like this on FreeBSD storage servers. Hope this helps.
 
