* [pve-devel] [PATCH ha-manager 0/3] watchdog: sync log to disk before and after expiring
@ 2025-05-19 13:09 Maximiliano Sandoval
2025-05-19 13:09 ` [pve-devel] [PATCH ha-manager 1/3] watchdog: separate if in two parts Maximiliano Sandoval
` (3 more replies)
0 siblings, 4 replies; 8+ messages in thread
From: Maximiliano Sandoval @ 2025-05-19 13:09 UTC (permalink / raw)
To: pve-devel
It is very hard to provide a definitive answer to whether a host fenced or not.
In some cases the journal on the disk can be missing up to 2 minutes since its
last logged entry and the time where another node detects the corosync link is
down, with such a gap, the fenced node would not even record that it lost
conenction and it is not possible to fully-determine if the node was fenced or
not.
This series:
- adds a second warning 10 seconds before the watchdog expires
- syncs the journal to disk after the warning was issued
- syncs the journal to disk after the watchdog expires
The variable names in the second commit could use some feedback. The way the
warning timeout is defined was arbitrary (10 seconds before the fence).
Maximiliano Sandoval (3):
watchdog: separate if in two parts
watchdog: warn when about to expire
watchdog: sync journal after sending expiration related messages
src/watchdog-mux.c | 40 +++++++++++++++++++++++++++++++++-------
1 file changed, 33 insertions(+), 7 deletions(-)
--
2.39.5
_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
^ permalink raw reply [flat|nested] 8+ messages in thread
* [pve-devel] [PATCH ha-manager 1/3] watchdog: separate if in two parts
2025-05-19 13:09 [pve-devel] [PATCH ha-manager 0/3] watchdog: sync log to disk before and after expiring Maximiliano Sandoval
@ 2025-05-19 13:09 ` Maximiliano Sandoval
2025-05-19 13:09 ` [pve-devel] [PATCH ha-manager 2/3] watchdog: warn when about to expire Maximiliano Sandoval
` (2 subsequent siblings)
3 siblings, 0 replies; 8+ messages in thread
From: Maximiliano Sandoval @ 2025-05-19 13:09 UTC (permalink / raw)
To: pve-devel
The sole purpose of this commit is to make the following commit's diff
easier to read.
Signed-off-by: Maximiliano Sandoval <m.sandoval@proxmox.com>
---
src/watchdog-mux.c | 12 +++++-------
1 file changed, 5 insertions(+), 7 deletions(-)
diff --git a/src/watchdog-mux.c b/src/watchdog-mux.c
index b4bcc0c..a9017b3 100644
--- a/src/watchdog-mux.c
+++ b/src/watchdog-mux.c
@@ -243,13 +243,11 @@ main(void)
int i;
time_t ctime = time(NULL);
for (i = 0; i < MAX_CLIENTS; i++) {
- if (
- client_list[i].fd != 0
- && client_list[i].time != 0
- && ((ctime - client_list[i].time) > client_watchdog_timeout)
- ) {
- update_watchdog = 0;
- fprintf(stderr, "client watchdog expired - disable watchdog updates\n");
+ if (client_list[i].fd != 0 && client_list[i].time != 0) {
+ if ((ctime - client_list[i].time) > client_watchdog_timeout) {
+ update_watchdog = 0;
+ fprintf(stderr, "client watchdog expired - disable watchdog updates\n");
+ }
}
}
}
--
2.39.5
_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
^ permalink raw reply [flat|nested] 8+ messages in thread
* [pve-devel] [PATCH ha-manager 2/3] watchdog: warn when about to expire
2025-05-19 13:09 [pve-devel] [PATCH ha-manager 0/3] watchdog: sync log to disk before and after expiring Maximiliano Sandoval
2025-05-19 13:09 ` [pve-devel] [PATCH ha-manager 1/3] watchdog: separate if in two parts Maximiliano Sandoval
@ 2025-05-19 13:09 ` Maximiliano Sandoval
2025-06-16 8:37 ` Aaron Lauterer
2025-06-17 6:11 ` Thomas Lamprecht
2025-05-19 13:09 ` [pve-devel] [PATCH ha-manager 3/3] watchdog: sync journal after sending expiration related messages Maximiliano Sandoval
2025-06-16 8:40 ` [pve-devel] [PATCH ha-manager 0/3] watchdog: sync log to disk before and after expiring Aaron Lauterer
3 siblings, 2 replies; 8+ messages in thread
From: Maximiliano Sandoval @ 2025-05-19 13:09 UTC (permalink / raw)
To: pve-devel
Signed-off-by: Maximiliano Sandoval <m.sandoval@proxmox.com>
---
src/watchdog-mux.c | 26 ++++++++++++++++++++++++++
1 file changed, 26 insertions(+)
diff --git a/src/watchdog-mux.c b/src/watchdog-mux.c
index a9017b3..e14c768 100644
--- a/src/watchdog-mux.c
+++ b/src/watchdog-mux.c
@@ -29,15 +29,24 @@
#define JOURNALCTL_BIN "/bin/journalctl"
+#define CLIENT_WATCHDOG_TIMEOUT_WARNING 50
+
int watchdog_fd = -1;
int watchdog_timeout = 10;
int client_watchdog_timeout = 60;
int update_watchdog = 1;
+enum warning_state_t {
+ NONE,
+ WARNING_ISSUED,
+ CRISIS_AVERTED,
+};
+
typedef struct {
int fd;
time_t time;
int magic_close;
+ enum warning_state_t warning_state;
} wd_client_t;
#define MAX_CLIENTS 100
@@ -54,6 +63,7 @@ alloc_client(int fd, time_t time)
client_list[i].fd = fd;
client_list[i].time = time;
client_list[i].magic_close = 0;
+ client_list[i].warning_state = NONE;
return &client_list[i];
}
}
@@ -244,6 +254,22 @@ main(void)
time_t ctime = time(NULL);
for (i = 0; i < MAX_CLIENTS; i++) {
if (client_list[i].fd != 0 && client_list[i].time != 0) {
+ if (
+ client_list[i].warning_state == WARNING_ISSUED
+ && (ctime - client_list[i].time) <= CLIENT_WATCHDOG_TIMEOUT_WARNING
+ ) {
+ client_list[i].warning_state = CRISIS_AVERTED;
+ fprintf(stderr, "phew, client watchdog was updated before expiring\n");
+ }
+
+ if (
+ client_list[i].warning_state != WARNING_ISSUED
+ && (ctime - client_list[i].time) > CLIENT_WATCHDOG_TIMEOUT_WARNING
+ ) {
+ client_list[i].warning_state = WARNING_ISSUED;
+ fprintf(stderr, "client watchdog is about to expire\n");
+ }
+
if ((ctime - client_list[i].time) > client_watchdog_timeout) {
update_watchdog = 0;
fprintf(stderr, "client watchdog expired - disable watchdog updates\n");
--
2.39.5
_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
^ permalink raw reply [flat|nested] 8+ messages in thread
* [pve-devel] [PATCH ha-manager 3/3] watchdog: sync journal after sending expiration related messages
2025-05-19 13:09 [pve-devel] [PATCH ha-manager 0/3] watchdog: sync log to disk before and after expiring Maximiliano Sandoval
2025-05-19 13:09 ` [pve-devel] [PATCH ha-manager 1/3] watchdog: separate if in two parts Maximiliano Sandoval
2025-05-19 13:09 ` [pve-devel] [PATCH ha-manager 2/3] watchdog: warn when about to expire Maximiliano Sandoval
@ 2025-05-19 13:09 ` Maximiliano Sandoval
2025-06-17 6:21 ` Thomas Lamprecht
2025-06-16 8:40 ` [pve-devel] [PATCH ha-manager 0/3] watchdog: sync log to disk before and after expiring Aaron Lauterer
3 siblings, 1 reply; 8+ messages in thread
From: Maximiliano Sandoval @ 2025-05-19 13:09 UTC (permalink / raw)
To: pve-devel
One sync comes after warning that the watchdog is about to expire, and a
second right after the watchdog expires.
To maximize the chances the log will contain entries relevant to a fence
event. This would be extremely useful for detecting whether a node
fenced.
Signed-off-by: Maximiliano Sandoval <m.sandoval@proxmox.com>
---
src/watchdog-mux.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/src/watchdog-mux.c b/src/watchdog-mux.c
index e14c768..8669b10 100644
--- a/src/watchdog-mux.c
+++ b/src/watchdog-mux.c
@@ -268,11 +268,13 @@ main(void)
) {
client_list[i].warning_state = WARNING_ISSUED;
fprintf(stderr, "client watchdog is about to expire\n");
+ sync_journal_unsafe();
}
if ((ctime - client_list[i].time) > client_watchdog_timeout) {
update_watchdog = 0;
fprintf(stderr, "client watchdog expired - disable watchdog updates\n");
+ sync_journal_unsafe();
}
}
}
--
2.39.5
_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [pve-devel] [PATCH ha-manager 2/3] watchdog: warn when about to expire
2025-05-19 13:09 ` [pve-devel] [PATCH ha-manager 2/3] watchdog: warn when about to expire Maximiliano Sandoval
@ 2025-06-16 8:37 ` Aaron Lauterer
2025-06-17 6:11 ` Thomas Lamprecht
1 sibling, 0 replies; 8+ messages in thread
From: Aaron Lauterer @ 2025-06-16 8:37 UTC (permalink / raw)
To: Proxmox VE development discussion, Maximiliano Sandoval
On 2025-05-19 15:09, Maximiliano Sandoval wrote:
> Signed-off-by: Maximiliano Sandoval <m.sandoval@proxmox.com>
> ---
> src/watchdog-mux.c | 26 ++++++++++++++++++++++++++
> 1 file changed, 26 insertions(+)
>
> diff --git a/src/watchdog-mux.c b/src/watchdog-mux.c
> index a9017b3..e14c768 100644
> --- a/src/watchdog-mux.c
> +++ b/src/watchdog-mux.c
> @@ -29,15 +29,24 @@
>
> #define JOURNALCTL_BIN "/bin/journalctl"
>
> +#define CLIENT_WATCHDOG_TIMEOUT_WARNING 50
some comment why 50 is used would be useful. Or alternatively it maybe
could be defined differently a bit further below, using the
client_watchdog_timeout as reference:
client_watchdog_timeout_warning = client_watchdog_timeout - 10;
This way would be clearer then that we want to react to the last 10
seconds before we have the timeout. But that is just some idea.
> +
> int watchdog_fd = -1;
> int watchdog_timeout = 10;
> int client_watchdog_timeout = 60;
> int update_watchdog = 1;
>
> +enum warning_state_t {
> + NONE,
> + WARNING_ISSUED,
> + CRISIS_AVERTED,
I don't like the "CRISIS" vocabulary here. Why not call it
"FENCE_AVERTED" or "HOST_FENCE_AVERTED" ?
It sounds less sensational and refers to what is averted
> +};
> +
> typedef struct {
> int fd;
> time_t time;
> int magic_close;
> + enum warning_state_t warning_state;
> } wd_client_t;
>
> #define MAX_CLIENTS 100
> @@ -54,6 +63,7 @@ alloc_client(int fd, time_t time)
> client_list[i].fd = fd;
> client_list[i].time = time;
> client_list[i].magic_close = 0;
> + client_list[i].warning_state = NONE;
> return &client_list[i];
> }
> }
> @@ -244,6 +254,22 @@ main(void)
> time_t ctime = time(NULL);
> for (i = 0; i < MAX_CLIENTS; i++) {
> if (client_list[i].fd != 0 && client_list[i].time != 0) {
> + if (
> + client_list[i].warning_state == WARNING_ISSUED
> + && (ctime - client_list[i].time) <= CLIENT_WATCHDOG_TIMEOUT_WARNING
> + ) {
> + client_list[i].warning_state = CRISIS_AVERTED;
> + fprintf(stderr, "phew, client watchdog was updated before expiring\n");
> + }
> +
> + if (
> + client_list[i].warning_state != WARNING_ISSUED
> + && (ctime - client_list[i].time) > CLIENT_WATCHDOG_TIMEOUT_WARNING
> + ) {
> + client_list[i].warning_state = WARNING_ISSUED;
> + fprintf(stderr, "client watchdog is about to expire\n");
> + }
> +
> if ((ctime - client_list[i].time) > client_watchdog_timeout) {
> update_watchdog = 0;
> fprintf(stderr, "client watchdog expired - disable watchdog updates\n");
_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [pve-devel] [PATCH ha-manager 0/3] watchdog: sync log to disk before and after expiring
2025-05-19 13:09 [pve-devel] [PATCH ha-manager 0/3] watchdog: sync log to disk before and after expiring Maximiliano Sandoval
` (2 preceding siblings ...)
2025-05-19 13:09 ` [pve-devel] [PATCH ha-manager 3/3] watchdog: sync journal after sending expiration related messages Maximiliano Sandoval
@ 2025-06-16 8:40 ` Aaron Lauterer
3 siblings, 0 replies; 8+ messages in thread
From: Aaron Lauterer @ 2025-06-16 8:40 UTC (permalink / raw)
To: Proxmox VE development discussion, Maximiliano Sandoval
tested it by applying this series to a node with HA guests and then
disabling the corosync network completely or, to test the "averted" log,
sleeping for 45 seconds before bringing the corosync network back up.
So far, it seems that the "about to expire" warning did make it into the
journal in my tests.
We will see in the future, how well that will work in production
systems, depending on the underlying storage layer.
Some smaller remarks on patch 2/3.
Considers this series:
Tested-By: Aaron Lauterer <a.lauterer@proxmox.com>
Reviewed-By: Aaron Lauterer <a.lauterer@proxmox.com>
On 2025-05-19 15:09, Maximiliano Sandoval wrote:
> It is very hard to provide a definitive answer to whether a host fenced or not.
> In some cases the journal on the disk can be missing up to 2 minutes since its
> last logged entry and the time where another node detects the corosync link is
> down, with such a gap, the fenced node would not even record that it lost
> conenction and it is not possible to fully-determine if the node was fenced or
> not.
>
> This series:
> - adds a second warning 10 seconds before the watchdog expires
> - syncs the journal to disk after the warning was issued
> - syncs the journal to disk after the watchdog expires
>
> The variable names in the second commit could use some feedback. The way the
> warning timeout is defined was arbitrary (10 seconds before the fence).
>
> Maximiliano Sandoval (3):
> watchdog: separate if in two parts
> watchdog: warn when about to expire
> watchdog: sync journal after sending expiration related messages
>
> src/watchdog-mux.c | 40 +++++++++++++++++++++++++++++++++-------
> 1 file changed, 33 insertions(+), 7 deletions(-)
>
_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [pve-devel] [PATCH ha-manager 2/3] watchdog: warn when about to expire
2025-05-19 13:09 ` [pve-devel] [PATCH ha-manager 2/3] watchdog: warn when about to expire Maximiliano Sandoval
2025-06-16 8:37 ` Aaron Lauterer
@ 2025-06-17 6:11 ` Thomas Lamprecht
1 sibling, 0 replies; 8+ messages in thread
From: Thomas Lamprecht @ 2025-06-17 6:11 UTC (permalink / raw)
To: Proxmox VE development discussion, Maximiliano Sandoval
Am 19.05.25 um 15:09 schrieb Maximiliano Sandoval:
> Signed-off-by: Maximiliano Sandoval <m.sandoval@proxmox.com>
> ---
> src/watchdog-mux.c | 26 ++++++++++++++++++++++++++
> 1 file changed, 26 insertions(+)
>
> diff --git a/src/watchdog-mux.c b/src/watchdog-mux.c
> index a9017b3..e14c768 100644
> --- a/src/watchdog-mux.c
> +++ b/src/watchdog-mux.c
> @@ -29,15 +29,24 @@
>
> #define JOURNALCTL_BIN "/bin/journalctl"
>
> +#define CLIENT_WATCHDOG_TIMEOUT_WARNING 50
> +
> int watchdog_fd = -1;
> int watchdog_timeout = 10;
> int client_watchdog_timeout = 60;
> int update_watchdog = 1;
>
> +enum warning_state_t {
> + NONE,
> + WARNING_ISSUED,
> + CRISIS_AVERTED,
> +};
> +
> typedef struct {
> int fd;
> time_t time;
> int magic_close;
> + enum warning_state_t warning_state;
> } wd_client_t;
>
> #define MAX_CLIENTS 100
> @@ -54,6 +63,7 @@ alloc_client(int fd, time_t time)
> client_list[i].fd = fd;
> client_list[i].time = time;
> client_list[i].magic_close = 0;
> + client_list[i].warning_state = NONE;
> return &client_list[i];
> }
> }
> @@ -244,6 +254,22 @@ main(void)
> time_t ctime = time(NULL);
> for (i = 0; i < MAX_CLIENTS; i++) {
> if (client_list[i].fd != 0 && client_list[i].time != 0) {
> + if (
> + client_list[i].warning_state == WARNING_ISSUED
> + && (ctime - client_list[i].time) <= CLIENT_WATCHDOG_TIMEOUT_WARNING
> + ) {
> + client_list[i].warning_state = CRISIS_AVERTED;
> + fprintf(stderr, "phew, client watchdog was updated before expiring\n");
please no "phew" in the logs, this is a serious topic were being
factual and bland is definitively preferred.
> + }
> +
> + if (
> + client_list[i].warning_state != WARNING_ISSUED
> + && (ctime - client_list[i].time) > CLIENT_WATCHDOG_TIMEOUT_WARNING
> + ) {
> + client_list[i].warning_state = WARNING_ISSUED;
> + fprintf(stderr, "client watchdog is about to expire\n");
would be good to include the time left here in the message, allows
admins/support a more informed picture, especially if the timeouts
ever change.
> + }
> +
> if ((ctime - client_list[i].time) > client_watchdog_timeout) {
> update_watchdog = 0;
> fprintf(stderr, "client watchdog expired - disable watchdog updates\n");
_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [pve-devel] [PATCH ha-manager 3/3] watchdog: sync journal after sending expiration related messages
2025-05-19 13:09 ` [pve-devel] [PATCH ha-manager 3/3] watchdog: sync journal after sending expiration related messages Maximiliano Sandoval
@ 2025-06-17 6:21 ` Thomas Lamprecht
0 siblings, 0 replies; 8+ messages in thread
From: Thomas Lamprecht @ 2025-06-17 6:21 UTC (permalink / raw)
To: Proxmox VE development discussion, Maximiliano Sandoval
Am 19.05.25 um 15:09 schrieb Maximiliano Sandoval:
> One sync comes after warning that the watchdog is about to expire, and a
> second right after the watchdog expires.
>
> To maximize the chances the log will contain entries relevant to a fence
> event. This would be extremely useful for detecting whether a node
> fenced.
>
> Signed-off-by: Maximiliano Sandoval <m.sandoval@proxmox.com>
> ---
> src/watchdog-mux.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/src/watchdog-mux.c b/src/watchdog-mux.c
> index e14c768..8669b10 100644
> --- a/src/watchdog-mux.c
> +++ b/src/watchdog-mux.c
> @@ -268,11 +268,13 @@ main(void)
> ) {
> client_list[i].warning_state = WARNING_ISSUED;
> fprintf(stderr, "client watchdog is about to expire\n");
> + sync_journal_unsafe();
The "unsafe" is there for a reason, on a loaded machine doing above
might trigger a few times and create a zombie left over process for
each of those.
Simplest fix might be doing a double fork there so that the parent
process does not exist anymore, in which case systemd collects the
child process exit status, albeit that wouldn't be the most efficient
solution.
> }
>
> if ((ctime - client_list[i].time) > client_watchdog_timeout) {
> update_watchdog = 0;
> fprintf(stderr, "client watchdog expired - disable watchdog updates\n");
> + sync_journal_unsafe();
This is basically useless compared to the status quo, there is already
such a call a few (compiled) instructions after that branch hits anyway
as we break the main loop then.
> }
> }
> }
_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2025-06-17 6:22 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2025-05-19 13:09 [pve-devel] [PATCH ha-manager 0/3] watchdog: sync log to disk before and after expiring Maximiliano Sandoval
2025-05-19 13:09 ` [pve-devel] [PATCH ha-manager 1/3] watchdog: separate if in two parts Maximiliano Sandoval
2025-05-19 13:09 ` [pve-devel] [PATCH ha-manager 2/3] watchdog: warn when about to expire Maximiliano Sandoval
2025-06-16 8:37 ` Aaron Lauterer
2025-06-17 6:11 ` Thomas Lamprecht
2025-05-19 13:09 ` [pve-devel] [PATCH ha-manager 3/3] watchdog: sync journal after sending expiration related messages Maximiliano Sandoval
2025-06-17 6:21 ` Thomas Lamprecht
2025-06-16 8:40 ` [pve-devel] [PATCH ha-manager 0/3] watchdog: sync log to disk before and after expiring Aaron Lauterer
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox
Service provided by Proxmox Server Solutions GmbH | Privacy | Legal