From: Fabian Grünbichler
To: pve-devel@lists.proxmox.com
Date: Wed, 30 Sep 2020 13:21:31 +0200
Message-Id: <20200930112131.2044392-1-f.gruenbichler@proxmox.com>
Subject: [pve-devel] [PATCH cluster] pmxcfs: protect CPG operations with mutex
List-Id: Proxmox VE development discussion

cpg_mcast_joined (and transitively, cpg_join/leave) are not thread-safe.
pmxcfs triggers such operations via FUSE and CPG dispatch callbacks,
which run in concurrent threads.

accordingly, we need to protect these operations with a mutex, otherwise
they might return CS_OK without actually doing what they were supposed
to do (which in turn can lead to the dfsm taking a wrong turn and
getting stuck in a supposedly short-lived state, blocking access via
FUSE and getting whole clusters fenced).

huge thanks to Alexandre Derumier for providing the initial bug report
and quite a lot of test runs while debugging this issue.

Signed-off-by: Fabian Grünbichler
---

Notes:
    we could recycle sync_mutex, but that makes it harder to reason about
    securing all code paths. it also protects non-CPG operations as part
    of the sync message queue handling, so mixing those up is non-ideal.

    @Alexandre: this is a slightly different approach compared to the
    test build from yesterday, so if you want to test this as well it
    would be very welcome :)

 data/src/dfsm.c | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/data/src/dfsm.c b/data/src/dfsm.c
index 172d877..17a3ba4 100644
--- a/data/src/dfsm.c
+++ b/data/src/dfsm.c
@@ -107,6 +107,7 @@ struct dfsm {
 	cpg_callbacks_t *cpg_callbacks;
 	dfsm_callbacks_t *dfsm_callbacks;
 	cpg_handle_t cpg_handle;
+	GMutex cpg_mutex;
 	struct cpg_name cpg_group_name;
 	uint32_t nodeid;
 	uint32_t pid;
@@ -204,7 +205,9 @@ dfsm_send_message_full(
 	cs_error_t result;
 	int retries = 0;
 loop:
+	g_mutex_lock (&dfsm->cpg_mutex);
 	result = cpg_mcast_joined(dfsm->cpg_handle, CPG_TYPE_AGREED, iov, len);
+	g_mutex_unlock (&dfsm->cpg_mutex);
 	if (retry && result == CS_ERR_TRY_AGAIN) {
 		nanosleep(&tvreq, NULL);
 		++retries;
@@ -1250,7 +1253,9 @@ dfsm_new(
 	if (!(dfsm->msg_queue = g_sequence_new(NULL)))
 		goto err;
-
+
+	g_mutex_init(&dfsm->cpg_mutex);
+
 	dfsm->log_domain = log_domain;
 	dfsm->data = data;
 	dfsm->mode = DFSM_MODE_START;
@@ -1424,7 +1429,9 @@ dfsm_join(dfsm_t *dfsm)
 	struct timespec tvreq = { .tv_sec = 0, .tv_nsec = 100000000 };
 	int retries = 0;
 loop:
+	g_mutex_lock (&dfsm->cpg_mutex);
 	result = cpg_join(dfsm->cpg_handle, &dfsm->cpg_group_name);
+	g_mutex_unlock (&dfsm->cpg_mutex);
 	if (result == CS_ERR_TRY_AGAIN) {
 		nanosleep(&tvreq, NULL);
 		++retries;
@@ -1453,7 +1460,9 @@ dfsm_leave (dfsm_t *dfsm)
 	struct timespec tvreq = { .tv_sec = 0, .tv_nsec = 100000000 };
 	int retries = 0;
 loop:
+	g_mutex_lock (&dfsm->cpg_mutex);
 	result = cpg_leave(dfsm->cpg_handle, &dfsm->cpg_group_name);
+	g_mutex_unlock (&dfsm->cpg_mutex);
 	if (result == CS_ERR_TRY_AGAIN) {
 		nanosleep(&tvreq, NULL);
 		++retries;
@@ -1509,6 +1518,8 @@ dfsm_destroy(dfsm_t *dfsm)
 	g_mutex_clear (&dfsm->sync_mutex);
 	g_cond_clear (&dfsm->sync_cond);
+
+	g_mutex_clear (&dfsm->cpg_mutex);
 	if (dfsm->results)
 		g_hash_table_destroy(dfsm->results);
-- 
2.20.1
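
for readers who want to try the locking pattern outside of pmxcfs, here is a minimal standalone sketch. it is not the pmxcfs code: it swaps GLib's GMutex for a plain pthread mutex so it builds without GLib or corosync, and every name in it (demo_handle_t, demo_mcast, demo_send, run_demo, the DEMO_* result codes) is hypothetical. demo_mcast only stands in for a library call that touches shared state without locking. as in the patch, the mutex is held just around the call itself, and the sleep before a retry happens outside the lock. unlike the patch, the sketch retries without an upper bound to keep it short.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical result codes standing in for CS_OK / CS_ERR_TRY_AGAIN. */
typedef enum { DEMO_OK, DEMO_TRY_AGAIN } demo_result_t;

/* Hypothetical handle; op_mutex plays the role of cpg_mutex. */
typedef struct {
    pthread_mutex_t op_mutex;
    unsigned long calls;      /* unsynchronized state inside the "library" */
    unsigned long delivered;  /* messages that were actually sent */
} demo_handle_t;

/* Stand-in for a non-thread-safe call: mutates shared state, no locking. */
static demo_result_t demo_mcast(demo_handle_t *h)
{
    h->calls++;
    if (h->calls % 7 == 0)
        return DEMO_TRY_AGAIN; /* simulate a transient "try again" */
    h->delivered++;
    return DEMO_OK;
}

/* Serialize the call with the mutex; sleep and retry outside the lock. */
static demo_result_t demo_send(demo_handle_t *h)
{
    struct timespec pause = { .tv_sec = 0, .tv_nsec = 1000 };
    demo_result_t result;

    do {
        pthread_mutex_lock(&h->op_mutex);
        result = demo_mcast(h);
        pthread_mutex_unlock(&h->op_mutex);
        if (result == DEMO_TRY_AGAIN)
            nanosleep(&pause, NULL);
    } while (result == DEMO_TRY_AGAIN);

    return result;
}

typedef struct { demo_handle_t *h; int count; } worker_arg_t;

static void *worker(void *p)
{
    worker_arg_t *arg = p;
    for (int i = 0; i < arg->count; i++)
        demo_send(arg->h);
    return NULL;
}

/* Start up to 16 threads sending per_thread messages each and
 * return how many messages were delivered in total. */
unsigned long run_demo(int threads, int per_thread)
{
    demo_handle_t h = { .calls = 0, .delivered = 0 };
    worker_arg_t arg = { &h, per_thread };
    pthread_t tids[16];

    if (threads > 16)
        threads = 16;
    pthread_mutex_init(&h.op_mutex, NULL);
    for (int i = 0; i < threads; i++) {
        int rc = pthread_create(&tids[i], NULL, worker, &arg);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < threads; i++)
        pthread_join(tids[i], NULL);
    pthread_mutex_destroy(&h.op_mutex);
    return h.delivered;
}
```

with the lock in place, every demo_send ends in exactly one delivered message, so run_demo(4, 1000) returns 4000. dropping the lock/unlock pair turns the counter updates into a data race, and the total can come out short even though every call reported DEMO_OK, which is loosely analogous to the CS_OK-without-effect behaviour described in the commit message.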