public inbox for pve-devel@lists.proxmox.com
* [pve-devel] [PATCH ha-manager v3 0/6] watchdog-mux: sync log to disk before and after expiring
@ 2025-07-04 13:38 Maximiliano Sandoval
From: Maximiliano Sandoval @ 2025-07-04 13:38 UTC
  To: pve-devel

Without a clear-cut message in the log, it is very hard to give a definitive
answer as to whether a host was fenced. In some cases the on-disk journal can
be missing up to 2 minutes between its last logged entry and the time at which
another node detects that the corosync link is down. With such a gap, the
fenced node does not even record that it lost its connection, and it is not
possible to determine conclusively whether the node was fenced.

This series:
 - adds a second warning 10 seconds before the watchdog expires
 - syncs the journal to disk after the warning was issued
 - syncs the journal to disk after the watchdog expires
 - allows for watchdog-mux to exit(EXIT_SUCCESS) before the fence (new in v3)
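
To illustrate the timing described above, here is a minimal sketch of how
the warning cutoff can be derived from the 60 second timeout. The macro
names, the enum, and classify_client() are hypothetical and chosen for this
example only; they are not taken from src/watchdog-mux.c, and whether the
real comparisons are strict or inclusive is an assumption.

```c
#include <assert.h>
#include <time.h>

/* Hypothetical names; the real constants in watchdog-mux.c may differ. */
#define CLIENT_WATCHDOG_TIMEOUT 60 /* seconds until the watchdog expires */
#define CLIENT_WATCHDOG_TIMEOUT_WARNING (CLIENT_WATCHDOG_TIMEOUT - 10)

static_assert(CLIENT_WATCHDOG_TIMEOUT_WARNING > 0,
              "warning cutoff must lie before the timeout");

enum client_state {
    CLIENT_OK,      /* last update is recent enough */
    CLIENT_WARN,    /* within 10 seconds of expiring: log and sync journal */
    CLIENT_EXPIRED, /* timeout reached: log, sync journal, stop updating */
};

/* Classify a client by the seconds elapsed since its last update.
 * Assumption: both thresholds are inclusive. */
static enum client_state classify_client(time_t now, time_t last_update)
{
    time_t idle = now - last_update;

    if (idle >= CLIENT_WATCHDOG_TIMEOUT)
        return CLIENT_EXPIRED;
    if (idle >= CLIENT_WATCHDOG_TIMEOUT_WARNING)
        return CLIENT_WARN;
    return CLIENT_OK;
}
```

With these values, a client that has been idle for 50 to 59 seconds would
trigger the warning, and one idle for 60 seconds or more counts as expired.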

Differences from v2:
 - Instead of explicitly adding a call to sync the journal after we disable
   updates, we help the process break out of the loop, allowing it to reach
   the code that calls the sync and then exit()

Differences from v1:
 - Define the warning cutoff based on the 60 second timeout
 - Change log messages and constant names
 - When not immediately fencing, run journal sync in double fork
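
For readers unfamiliar with the double-fork technique mentioned in the last
item, the following is a rough sketch of running `journalctl --sync`
without blocking the caller. The helper names spawn_detached() and
sync_journal_detached() are hypothetical, and the actual patch may invoke
the sync differently (for example with a fixed binary path or different
error handling).

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run argv[0] in a grandchild process (double fork) so that the caller
 * never waits on the command itself. Returns 0 if the intermediate child
 * was forked and reaped cleanly, -1 otherwise. Hypothetical helper. */
static int spawn_detached(char *const argv[])
{
    pid_t child = fork();
    if (child < 0)
        return -1;

    if (child == 0) {
        /* Intermediate child: fork again and exit right away. */
        pid_t grandchild = fork();
        if (grandchild == 0) {
            /* Grandchild: gets reparented once the intermediate exits. */
            execvp(argv[0], argv);
            _exit(127); /* only reached if exec failed */
        }
        _exit(grandchild < 0 ? 1 : 0);
    }

    /* Parent: reap only the short-lived intermediate child. */
    int status;
    if (waitpid(child, &status, 0) < 0)
        return -1;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}

/* Ask journald to flush to disk without blocking the caller. */
static int sync_journal_detached(void)
{
    char *const argv[] = { "journalctl", "--sync", NULL };
    return spawn_detached(argv);
}
```

The caller only waits for the intermediate child, which exits right after
the second fork, so the main loop can keep servicing the watchdog while the
sync runs. Note that the return value says nothing about whether the sync
itself succeeded.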

Maximiliano Sandoval (6):
  watchdog-mux: Use #define for 60s timeout
  watchdog-mux: split if block in two if blocks
  watchdog-mux: warn when about to expire
  watchdog-mux: sync journal right after fence warning
  watchdog-mux: break out of loop when updates are disabled
  watchdog-mux: Remove wrapping if guard

 src/watchdog-mux.c | 61 +++++++++++++++++++++++++++++++++++++---------
 1 file changed, 49 insertions(+), 12 deletions(-)

-- 
2.39.5



_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel



Thread overview: 8+ messages
2025-07-04 13:38 [pve-devel] [PATCH ha-manager v3 0/6] watchdog-mux: sync log to disk before and after expiring Maximiliano Sandoval
2025-07-04 13:38 ` [pve-devel] [PATCH ha-manager v3 1/6] watchdog-mux: Use #define for 60s timeout Maximiliano Sandoval
2025-07-04 13:38 ` [pve-devel] [PATCH ha-manager v3 2/6] watchdog-mux: split if block in two if blocks Maximiliano Sandoval
2025-07-04 13:38 ` [pve-devel] [PATCH ha-manager v3 3/6] watchdog-mux: warn when about to expire Maximiliano Sandoval
2025-07-04 13:39 ` [pve-devel] [PATCH ha-manager v3 4/6] watchdog-mux: sync journal right after fence warning Maximiliano Sandoval
2025-07-04 13:39 ` [pve-devel] [PATCH ha-manager v3 5/6] watchdog-mux: break out of loop when updates are disabled Maximiliano Sandoval
2025-07-04 13:39 ` [pve-devel] [PATCH ha-manager v3 6/6] watchdog-mux: Remove wrapping if guard Maximiliano Sandoval
2025-07-17 16:00 ` [pve-devel] applied: [PATCH ha-manager v3 0/6] watchdog-mux: sync log to disk before and after expiring Thomas Lamprecht
