From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <pve-devel-bounces@lists.proxmox.com>
Received: from firstgate.proxmox.com (firstgate.proxmox.com [IPv6:2a01:7e0:0:424::9])
	by lore.proxmox.com (Postfix) with ESMTPS id DF4521FF17F
	for <inbox@lore.proxmox.com>; Mon, 19 May 2025 15:09:53 +0200 (CEST)
Received: from firstgate.proxmox.com (localhost [127.0.0.1])
	by firstgate.proxmox.com (Proxmox) with ESMTP id EB546AAC4;
	Mon, 19 May 2025 15:09:40 +0200 (CEST)
From: Maximiliano Sandoval <m.sandoval@proxmox.com>
To: pve-devel@lists.proxmox.com
Date: Mon, 19 May 2025 15:09:32 +0200
Message-Id: <20250519130935.365142-1-m.sandoval@proxmox.com>
X-Mailer: git-send-email 2.39.5
MIME-Version: 1.0
X-SPAM-LEVEL: Spam detection results:  0
 AWL 0.094 Adjusted score from AWL reputation of From: address
 BAYES_00                 -1.9 Bayes spam probability is 0 to 1%
 DMARC_MISSING             0.1 Missing DMARC policy
 KAM_DMARC_STATUS 0.01 Test Rule for DKIM or SPF Failure with Strict Alignment
 RCVD_IN_VALIDITY_CERTIFIED_BLOCKED 0.001 ADMINISTRATOR NOTICE: The query to
 Validity was blocked. See
 https://knowledge.validity.com/hc/en-us/articles/20961730681243 for more
 information.
 RCVD_IN_VALIDITY_RPBL_BLOCKED 0.001 ADMINISTRATOR NOTICE: The query to
 Validity was blocked. See
 https://knowledge.validity.com/hc/en-us/articles/20961730681243 for more
 information.
 RCVD_IN_VALIDITY_SAFE_BLOCKED 0.001 ADMINISTRATOR NOTICE: The query to
 Validity was blocked. See
 https://knowledge.validity.com/hc/en-us/articles/20961730681243 for more
 information.
 SPF_HELO_NONE           0.001 SPF: HELO does not publish an SPF Record
 SPF_PASS               -0.001 SPF: sender matches SPF record
Subject: [pve-devel] [PATCH ha-manager 0/3] watchdog: sync log to disk
 before and after expiring
X-BeenThere: pve-devel@lists.proxmox.com
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Proxmox VE development discussion <pve-devel.lists.proxmox.com>
List-Unsubscribe: <https://lists.proxmox.com/cgi-bin/mailman/options/pve-devel>, 
 <mailto:pve-devel-request@lists.proxmox.com?subject=unsubscribe>
List-Archive: <http://lists.proxmox.com/pipermail/pve-devel/>
List-Post: <mailto:pve-devel@lists.proxmox.com>
List-Help: <mailto:pve-devel-request@lists.proxmox.com?subject=help>
List-Subscribe: <https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel>, 
 <mailto:pve-devel-request@lists.proxmox.com?subject=subscribe>
Reply-To: Proxmox VE development discussion <pve-devel@lists.proxmox.com>
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Errors-To: pve-devel-bounces@lists.proxmox.com
Sender: "pve-devel" <pve-devel-bounces@lists.proxmox.com>

It is very hard to provide a definitive answer to whether a host fenced or not.
In some cases the journal on the disk can be missing up to 2 minutes since its
last logged entry and the time where another node detects the corosync link is
down, with such a gap, the fenced node would not even record that it lost
conenction and it is not possible to fully-determine if the node was fenced or
not.

This series:
 - adds a second warning 10 seconds before the watchdog expires
 - syncs the journal to disk after the warning was issued
 - syncs the journal to disk after the watchdog expires

The variable names in the second commit could use some feedback. The way the
warning timeout is defined was arbitrary (10 seconds before the fence).

Maximiliano Sandoval (3):
  watchdog: separate if in two parts
  watchdog: warn when about to expire
  watchdog: sync journal after sending expiration related messages

 src/watchdog-mux.c | 40 +++++++++++++++++++++++++++++++++-------
 1 file changed, 33 insertions(+), 7 deletions(-)

-- 
2.39.5



_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel