Date: Fri, 25 Sep 2020 14:51:30 +0200
From: Fabian Grünbichler
To: Proxmox VE development discussion <pve-devel@lists.proxmox.com>
Subject: Re: [pve-devel] corosync bug: cluster break after 1 node clean shutdown
Message-Id: <1601037918.lwca57m6tz.astroid@nora.none>
In-Reply-To: <1157671072.1253096.1601027205997.JavaMail.zimbra@odiso.com>

On September 25, 2020 11:46 am, Alexandre DERUMIER wrote:
>>> I will rebuild once more modifying the send code a bit to log a lot more
>>> details when sending state messages, it would be great if you could
>>> repeat with that as we are still unable to reproduce the issue.
> 
> ok, no problem, I'm able to easily reproduce it, I'll do new test when you'll
> send the new version.
> 
> (and thanks again to debugging this, because It's really beyond my competence)

same procedure as last time, same place, new checksums:

6b5e2defe543a874e0f7a883e40b279c997438dde158566c4c93c11ea531aef924ed2e4eb2506b5064e49ec9bdd4ebe7acd0b9278e9286eac527b0b15a43d8d7  pve-cluster_6.1-8_amd64.deb
d57ddc08824055826ee15c9c255690d9140e43f8d5164949108f0dc483a2d181b2bda76f0e7f47202699a062342c0cf0bba8f2ae0f7c5411af9967ef051050a0  pve-cluster-dbgsym_6.1-8_amd64.deb

I found one (unfortunately unrelated) bug in our error handling, so
there's that at least ;)

diff --git a/data/src/dfsm.c b/data/src/dfsm.c
index 529c7f9..f3397a0 100644
--- a/data/src/dfsm.c
+++ b/data/src/dfsm.c
@@ -162,8 +162,8 @@ static void
 dfsm_send_sync_message_abort(dfsm_t *dfsm)
 {
 	g_return_if_fail(dfsm != NULL);
-
 	g_mutex_lock (&dfsm->sync_mutex);
+	cfs_dom_debug(dfsm->log_domain, "dfsm_send_sync_message_abort - %" PRIu64 " / %" PRIu64, dfsm->msgcount_rcvd, dfsm->msgcount);
 	dfsm->msgcount_rcvd = dfsm->msgcount;
 	g_cond_broadcast (&dfsm->sync_cond);
 	g_mutex_unlock (&dfsm->sync_mutex);
@@ -181,6 +181,7 @@ dfsm_record_local_result(
 
 	g_mutex_lock (&dfsm->sync_mutex);
 	dfsm_result_t *rp = (dfsm_result_t *)g_hash_table_lookup(dfsm->results, &msg_count);
+	cfs_dom_debug(dfsm->log_domain, "dfsm_record_local_result - %" PRIu64": %d", msg_count, msg_result);
 	if (rp) {
 		rp->result = msg_result;
 		rp->processed = processed;
@@ -224,6 +225,48 @@ loop:
 	return result;
 }
 
+static cs_error_t 
+dfsm_send_message_full_debug_state(
+	dfsm_t *dfsm,
+	struct iovec *iov, 
+	unsigned int len,
+	int retry)
+{
+	g_return_val_if_fail(dfsm != NULL, CS_ERR_INVALID_PARAM);
+	g_return_val_if_fail(!len || iov != NULL, CS_ERR_INVALID_PARAM);
+
+	struct timespec tvreq = { .tv_sec = 0, .tv_nsec = 100000000 };
+	cs_error_t result;
+	int retries = 0;
+	cfs_dom_message(dfsm->log_domain, "send state message debug");
+	cfs_dom_message(dfsm->log_domain, "cpg_handle %" PRIu64, dfsm->cpg_handle);
+	for (int i = 0; i < len; i++)
+		cfs_dom_message(dfsm->log_domain, "iov[%d] len %zd", i, iov[i].iov_len);
+loop:
+	cfs_dom_message(dfsm->log_domain, "send state message loop body");
+
+	result = cpg_mcast_joined(dfsm->cpg_handle, CPG_TYPE_AGREED, iov, len);
+
+	cfs_dom_message(dfsm->log_domain, "send state message result: %d", result);
+	if (retry && result == CS_ERR_TRY_AGAIN) {
+		nanosleep(&tvreq, NULL);
+		++retries;
+		if ((retries % 10) == 0)
+			cfs_dom_message(dfsm->log_domain, "cpg_send_message retry %d", retries);
+		if (retries < 100)
+			goto loop;
+	}
+
+	if (retries)
+		cfs_dom_message(dfsm->log_domain, "cpg_send_message retried %d times", retries);
+
+	if (result != CS_OK &&
+	    (!retry || result != CS_ERR_TRY_AGAIN))
+		cfs_dom_critical(dfsm->log_domain, "cpg_send_message failed: %d", result);
+
+	return result;
+}
+
 static cs_error_t 
 dfsm_send_state_message_full(
 	dfsm_t *dfsm,
@@ -235,6 +278,8 @@ dfsm_send_state_message_full(
 	g_return_val_if_fail(DFSM_VALID_STATE_MESSAGE(type), CS_ERR_INVALID_PARAM);
 	g_return_val_if_fail(!len || iov != NULL, CS_ERR_INVALID_PARAM);
 
+	cfs_dom_debug(dfsm->log_domain, "dfsm_send_state_message_full: type %d len %d", type, len);
+
 	dfsm_message_state_header_t header;
 	header.base.type = type;
 	header.base.subtype = 0;
@@ -252,7 +297,7 @@ dfsm_send_state_message_full(
 	for (int i = 0; i < len; i++)
 		real_iov[i + 1] = iov[i];
 
-	return dfsm_send_message_full(dfsm, real_iov, len + 1, 1);
+	return type == DFSM_MESSAGE_STATE ? dfsm_send_message_full_debug_state(dfsm, real_iov, len + 1, 1) : dfsm_send_message_full(dfsm, real_iov, len + 1, 1);
 }
 
 cs_error_t 
@@ -317,6 +362,7 @@ dfsm_send_message_sync(
 	for (int i = 0; i < len; i++)
 		real_iov[i + 1] = iov[i];
 
+	cfs_dom_debug(dfsm->log_domain, "dfsm_send_messag_sync: type NORMAL, count %" PRIu64 ", msgtype %d, len %d", msgcount, msgtype, len);
 	cs_error_t result = dfsm_send_message_full(dfsm, real_iov, len + 1, 1);
 
 	g_mutex_unlock (&dfsm->sync_mutex);
@@ -335,10 +381,12 @@ dfsm_send_message_sync(
 	if (rp) {
 		g_mutex_lock (&dfsm->sync_mutex);
 
-		while (dfsm->msgcount_rcvd < msgcount)
+		while (dfsm->msgcount_rcvd < msgcount) {
+			cfs_dom_debug(dfsm->log_domain, "dfsm_send_message_sync: waiting for received messages %" PRIu64 " / %" PRIu64, dfsm->msgcount_rcvd, msgcount);
 			g_cond_wait (&dfsm->sync_cond, &dfsm->sync_mutex);
+		}
+		cfs_dom_debug(dfsm->log_domain, "dfsm_send_message_sync: done waiting for received messages!");
 
-		 
 		g_hash_table_remove(dfsm->results, &rp->msgcount);
 	
 		g_mutex_unlock (&dfsm->sync_mutex);
@@ -685,9 +733,13 @@ dfsm_cpg_deliver_callback(
 		return;
 	}
 
+	cfs_dom_debug(dfsm->log_domain, "received message's header type is %d", base_header->type);
+
 	if (base_header->type == DFSM_MESSAGE_NORMAL) {
 
 		dfsm_message_normal_header_t *header = (dfsm_message_normal_header_t *)msg;
+		cfs_dom_debug(dfsm->log_domain, "received normal message (type = %d, subtype = %d, %zd bytes)",
+			base_header->type, base_header->subtype, msg_len);
 
 		if (msg_len < sizeof(dfsm_message_normal_header_t)) {
 			cfs_dom_critical(dfsm->log_domain, "received short message (type = %d, subtype = %d, %zd bytes)",
@@ -704,6 +756,8 @@ dfsm_cpg_deliver_callback(
 		} else {
 
 			int msg_res = -1;
+			cfs_dom_debug(dfsm->log_domain, "deliver message %" PRIu64 " (subtype = %d, length = %zd)",
+				header->count, base_header->subtype, msg_len); 
 			int res = dfsm->dfsm_callbacks->dfsm_deliver_fn(
 				dfsm, dfsm->data, &msg_res, nodeid, pid, base_header->subtype, 
 				base_header->time, (uint8_t *)msg + sizeof(dfsm_message_normal_header_t),
@@ -724,6 +778,8 @@ dfsm_cpg_deliver_callback(
 	 */
 
 	dfsm_message_state_header_t *header = (dfsm_message_state_header_t *)msg;
+	cfs_dom_debug(dfsm->log_domain, "received state message (type = %d, subtype = %d, %zd bytes), mode is %d",
+		base_header->type, base_header->subtype, msg_len, mode);
 
 	if (msg_len < sizeof(dfsm_message_state_header_t)) {
 		cfs_dom_critical(dfsm->log_domain, "received short state message (type = %d, subtype = %d, %zd bytes)",
@@ -744,6 +800,7 @@ dfsm_cpg_deliver_callback(
 	if (mode == DFSM_MODE_SYNCED) {
 		if (base_header->type == DFSM_MESSAGE_UPDATE_COMPLETE) {
+			cfs_dom_debug(dfsm->log_domain, "received update complete message");
 
 			for (int i = 0; i < dfsm->sync_info->node_count; i++)
 				dfsm->sync_info->nodes[i].synced = 1;
 
@@ -754,6 +811,7 @@ dfsm_cpg_deliver_callback(
 			return;
 
 		} else if (base_header->type == DFSM_MESSAGE_VERIFY_REQUEST) {
+			cfs_dom_debug(dfsm->log_domain, "received verify request message");
 
 			if (msg_len != sizeof(dfsm->csum_counter)) {
 				cfs_dom_critical(dfsm->log_domain, "cpg received verify request with wrong length (%zd bytes) form node %d/%d", msg_len, nodeid, pid);
@@ -823,7 +881,6 @@ dfsm_cpg_deliver_callback(
 
 	} else if (mode == DFSM_MODE_START_SYNC) {
 
 		if (base_header->type == DFSM_MESSAGE_SYNC_START) {
-
 			if (nodeid != dfsm->lowest_nodeid) {
 				cfs_dom_critical(dfsm->log_domain, "ignore sync request from wrong member %d/%d", nodeid, pid);
@@ -861,6 +918,7 @@ dfsm_cpg_deliver_callback(
 			return;
 
 		} else if (base_header->type == DFSM_MESSAGE_STATE) {
+			cfs_dom_debug(dfsm->log_domain, "received state message for %d/%d", nodeid, pid);
 
 			dfsm_node_info_t *ni;
 	
@@ -906,6 +964,8 @@ dfsm_cpg_deliver_callback(
 					goto leave;
 				}
 
+			} else {
+				cfs_dom_debug(dfsm->log_domain, "haven't received all states, waiting for more");
 			}
 
 			return;
@@ -914,6 +974,7 @@ dfsm_cpg_deliver_callback(
 	} else if (mode == DFSM_MODE_UPDATE) {
 
 		if (base_header->type == DFSM_MESSAGE_UPDATE) {
+			cfs_dom_debug(dfsm->log_domain, "received update message");
 	
 			int res = dfsm->dfsm_callbacks->dfsm_process_update_fn(
 				dfsm, dfsm->data, dfsm->sync_info, nodeid, pid, msg, msg_len);
@@ -925,6 +986,7 @@ dfsm_cpg_deliver_callback(
 
 		} else if (base_header->type == DFSM_MESSAGE_UPDATE_COMPLETE) {
 
+			cfs_dom_debug(dfsm->log_domain, "received update complete message");
 
 			int res = dfsm->dfsm_callbacks->dfsm_commit_fn(dfsm, dfsm->data, dfsm->sync_info);
 
@@ -1104,6 +1166,7 @@ dfsm_cpg_confchg_callback(
 	size_t joined_list_entries)
 {
 	cs_error_t result;
+	cfs_debug("dfsm_cpg_confchg_callback called");
 
 	dfsm_t *dfsm = NULL;
 	result = cpg_context_get(handle, (gpointer *)&dfsm);
@@ -1190,7 +1253,7 @@ dfsm_cpg_confchg_callback(
 
 	dfsm_set_mode(dfsm, DFSM_MODE_START_SYNC);
 	if (lowest_nodeid == dfsm->nodeid) {
-		if (!dfsm_send_state_message_full(dfsm, DFSM_MESSAGE_SYNC_START, NULL, 0)) {
+		if (dfsm_send_state_message_full(dfsm, DFSM_MESSAGE_SYNC_START, NULL, 0) != CS_OK) {
			cfs_dom_critical(dfsm->log_domain, "failed to send SYNC_START message");
			goto leave;
		}