From: Friedrich Weber <f.weber@proxmox.com>
To: pmg-devel@lists.proxmox.com
Subject: [pmg-devel] [PATCH v2 pmg-api] fix #5189: cluster: avoid sync errors for statistics and quarantine
Date: Thu, 22 Feb 2024 17:35:35 +0100
Message-ID: <20240222163535.1112846-1-f.weber@proxmox.com>

After restoring a backup (including statistics) of a cluster node on a
fresh node and then creating a cluster, the following can happen
(node 1 being the master and node 2 being a regular node): `ClusterInfo`
on node 1 has no record of the last-synchronized `CStatistic` row id of
node 2. Thus, pmgmirror on node 1 initializes the record with -1 and
tries to synchronize *all* `CStatistic` rows with cid 2 from node 2.
But (some of) these rows may already exist on node 1, because they
were part of the backup, so pmgmirror on node 1 triggers a Postgres
unique constraint violation, statistics synchronization on node 1
fails, and node 1 remains in the "synchronizing" state.

Fix this as follows: when a new node is added to a cluster, the master
now initializes its `ClusterInfo` record of the last-synchronized
`CStatistic` row id for that node's cid with the maximum row id that
exists in the local `CStatistic` table for that cid, or with -1 if the
local `CStatistic` table has no row for that cid. This is valid because
the newly-added node copies the master's `CStatistic` table during
cluster join.

Do the same for the `CMailStore` table, where a similar sync error
could happen, e.g., if the table has rows for both node cids, node 2 is
shut down and manually removed from cluster.conf, the maxcid is
manually reset to 1, and a fresh node is joined to the cluster and
gets assigned cid 2.

Signed-off-by: Friedrich Weber <f.weber@proxmox.com>
---
Notes:
    v1 -> v2:
    - initialize master `ClusterInfo` record for `CStatistic` once when
      joining a new node instead of initializing on the first cluster
      sync, as discussed with Stoiko off-list
    - do the same for `CMailStore`, where a similar sync error can happen

    As a side effect of this change, `ClusterInfo` on node 2 will now also
    have `lastid_CStatistic` and `lastid_CMailStore` rows for its own cid
    2 after join, because the database is copied over *after* the master
    runs `update_master_clusterinfo`, and `update_client_clusterinfo` on
    node 2 only deletes rows with the master cid 1. However, if I
    understand the cluster sync code correctly (and that's a big if),
    these additional rows should not have any effect, as node 2 will
    only ever read and update `ClusterInfo` rows for other cids, never
    for its own cid.

    v1: https://lists.proxmox.com/pipermail/pmg-devel/2024-January/002657.html
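
For illustration only, here is a minimal standalone Perl sketch of the
initialization described in the commit message. It is not part of the
patch and does not touch a database: the in-memory `@cstatistic` list
and the helpers `initial_lastid` and `rows_to_sync` are hypothetical
names made up for this example. It also assumes that a sync only
fetches rows whose rid is greater than the recorded last-synchronized
id, which is how I read the description above, not something checked
against pmgmirror.

```perl
use strict;
use warnings;
use List::Util qw(max);

# Hypothetical stand-in for the master's CStatistic table after a backup
# restore: each row carries the owning node's cid and its row id (rid).
my @cstatistic = (
    { cid => 1, rid => 10 },
    { cid => 1, rid => 11 },
    { cid => 2, rid => 5 },
    { cid => 2, rid => 7 },
);

# Mirrors "SELECT COALESCE(max(rid), -1) FROM $table WHERE cid = $clientcid":
# the highest local rid for the given cid, or -1 if there is no such row.
sub initial_lastid {
    my ($rows, $clientcid) = @_;
    my @rids = map { $_->{rid} } grep { $_->{cid} == $clientcid } @$rows;
    return @rids ? max(@rids) : -1;
}

# Assumption: a sync only fetches rows of the given cid whose rid is
# greater than the recorded last-synchronized id.
sub rows_to_sync {
    my ($remote_rows, $clientcid, $lastid) = @_;
    return grep { $_->{cid} == $clientcid && $_->{rid} > $lastid } @$remote_rows;
}

# The joining node copies the master's table, so it holds the same rows.
my $lastid = initial_lastid(\@cstatistic, 2);
my @old_behavior = rows_to_sync(\@cstatistic, 2, -1);
my @new_behavior = rows_to_sync(\@cstatistic, 2, $lastid);

print "initial lastid for cid 2: $lastid\n";
print "initial lastid for unknown cid 3: ", initial_lastid(\@cstatistic, 3), "\n";
print "rows re-fetched when starting from -1: ", scalar(@old_behavior), "\n";
print "rows re-fetched when starting from $lastid: ", scalar(@new_behavior), "\n";
```

Starting from -1, both cid 2 rows would be fetched again although the
master already has them, which corresponds to the unique constraint
violation described above. Starting from the local maximum, nothing is
fetched until node 2 produces new rows.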

 src/PMG/DBTools.pm | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/src/PMG/DBTools.pm b/src/PMG/DBTools.pm
index 6112566..8770d06 100644
--- a/src/PMG/DBTools.pm
+++ b/src/PMG/DBTools.pm
@@ -1132,6 +1132,14 @@ sub update_master_clusterinfo {
         $dbh->do ("INSERT INTO ClusterInfo (cid, name, ivalue) select $clientcid, 'lastmt_$table', " .
                   "EXTRACT(EPOCH FROM now())::INTEGER");
     }
+
+    my @lastid_tables = ('CStatistic', 'CMailStore');
+
+    for my $table (@lastid_tables) {
+        $dbh->do("INSERT INTO ClusterInfo (cid, name, ivalue) " .
+            "SELECT $clientcid, 'lastid_$table', COALESCE (max (rid), -1) FROM $table " .
+            "WHERE cid = $clientcid");
+    }
 }

 sub update_client_clusterinfo {
--
2.39.2

Thread overview:
2024-02-22 16:35 Friedrich Weber [this message]
2024-02-22 22:01 ` [pmg-devel] applied: " Stoiko Ivanov