From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from firstgate.proxmox.com (firstgate.proxmox.com [212.224.123.68]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits)) (No client certificate requested) by lists.proxmox.com (Postfix) with ESMTPS id 1C60884830 for ; Tue, 14 Dec 2021 11:19:24 +0100 (CET) Received: from firstgate.proxmox.com (localhost [127.0.0.1]) by firstgate.proxmox.com (Proxmox) with ESMTP id E32382265C for ; Tue, 14 Dec 2021 11:19:23 +0100 (CET) Received: from proxmox-new.maurer-it.com (proxmox-new.maurer-it.com [94.136.29.106]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits)) (No client certificate requested) by firstgate.proxmox.com (Proxmox) with ESMTPS id 6E18822582 for ; Tue, 14 Dec 2021 11:19:22 +0100 (CET) Received: from proxmox-new.maurer-it.com (localhost.localdomain [127.0.0.1]) by proxmox-new.maurer-it.com (Proxmox) with ESMTP id 40E7945003 for ; Tue, 14 Dec 2021 11:19:22 +0100 (CET) From: =?UTF-8?q?Fabian=20Gr=C3=BCnbichler?= To: pve-devel@lists.proxmox.com Date: Tue, 14 Dec 2021 11:19:12 +0100 Message-Id: <20211214101912.3509825-2-f.gruenbichler@proxmox.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20211214101912.3509825-1-f.gruenbichler@proxmox.com> References: <20211214101912.3509825-1-f.gruenbichler@proxmox.com> MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-SPAM-LEVEL: Spam detection results: 0 AWL 0.250 Adjusted score from AWL reputation of From: address BAYES_00 -1.9 Bayes spam probability is 0 to 1% KAM_DMARC_STATUS 0.01 Test Rule for DKIM or SPF Failure with Strict Alignment SPF_HELO_NONE 0.001 SPF: HELO does not publish an SPF Record SPF_PASS -0.001 SPF: sender matches SPF record Subject: [pve-devel] [PATCH cluster 2/2] clusterlog: fix segfault / wrong iteration bounds X-BeenThere: pve-devel@lists.proxmox.com X-Mailman-Version: 2.1.29 Precedence: list List-Id: Proxmox VE development discussion List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 14 Dec 2021 10:19:24 -0000 the clusterlog struct is a basic ring buffer: struct clog_base { uint32_t size; // total size of this clog_base uint32_t cpos; // index into data, starts counting at start of clog_base, initially 0 char data[]; }; an entry consists of indices of the next and previous entries and various fields (fixed-length ones omitted here): typedef struct { uint32_t prev; // index of previous entry, or 0 if none exists uint32_t next; // index of next entry [..] // fixed-length fields uint8_t node_len; uint8_t ident_len; uint8_t tag_len; uint32_t msg_len; char data[]; // node+ident+tag+msg - variable-length fields } clog_entry_t; the next and prev indices are calculated when allocating a new entry, and the position of the current entry 'cpos' is updated accordingly (clog_alloc_entry): - size of the entry is padded with up to 7 bytes - first entry goes to index 8 - second and subsequent entries go to the current entry's 'next' index - if the current entry's 'next' index is out of bonds, the first entry is overwritten => wrap-around - the 'prev' index of the new entry is set to cpos - cpos is set to the index of the new entry - the 'next' index of the new entry is set to its index+padded size when iterating over the entries, the following bounds are used to follow the 'prev' links starting at the current entry: while (cpos && (cpos <= clog->cpos || cpos > (clog->cpos + CLOG_MAX_ENTRY_SIZE))) { while this handles a not-yet-wrapped around ring buffer (cpos would be 0 when reaching the first entry), and tries to handle wrap-arounds by terminating when reaching a 'red-zone' of 'CLOG_MAX_ENTRY_SIZE' starting at the current entry (this covers the current entry which was already visited as first entry during the iteration, and the next entry after it which might have been overwritten) - but it's possible that entries line up so that the wrap-around 'prev' index of the first entry points to a location *before* the current entry. for example, looking at clog_base with S being the size field, C being the cpos field, followed by the actual data. N/P are the next/prev indices of the entry at C, Q denotes the 'prev' index of the first entry in the data array, and 'R' the red zone used for the loop check in case of wrap-around. first, fill up the buffer with six large entries: Q P C N | | | | | | | | v v v v +-+-+------+------+------+------+------+------+-+ | | | | | | | | |x| | | | 1 | 2 | 3 | 4 | 5 | 6 |x| | | | | | | | | |x| +-+-+------+------+------+------+------+------+-+ S C RRRRRRRRRRR iterating from C backwards ends up at Q being 0, terminating the loop without a wrap-around after having visit 6->1 now the next (in this example, smaller) entry that gets allocated/insert needs to wrap around, because the empty space at the end (denoted by XXX) is too small: C N QP | | || | | || v v vv +-+-+------+------+------+------+------+------+-+ | | | | | | | | |x| | | | 7 | 2 | 3 | 4 | 5 | 6 |x| | | | | | | | | |x| +-+-+------+------+------+------+------+------+-+ S C RRRRRRRRRRR iterating backwards from C terminates the loop when reaching the red zone, with the (second) entry no longer being considered since it partly overlaps it. only 7->3 are visited. adding more entries we end up with the following layout: P QC N | || | | || | v vv v +-+-+------+---+---+---+---+---+---+---+---+--+-+ | | | | | | | | | | | |##|x| | | | 7 | 8 | 9 |10 |11 |12 |13 |14 |15 |#6|x| | | | | | | | | | | | |##|x| +-+-+------+---+---+---+---+---+---+---+---+--+-+ S C RRRRRRRRRRR with # denoting space previously occupied the last large entry (#6) which is still unmodified (the rest of that entry's data has been overwritten by entries #14 and #15). iterating from C (to the left/P) the loop ends up at entry #7, follows the link to Q (which satisfies the loop bounds as Q < C), and the data starting at (invalid index) Q gets interpreted as an entry. it is possible (though even more unlikely than the partial overwrite case) that Q and C line up perfectly, which would cause the loop to become an infinite loop. the loop *should* terminate after having visited 15-7, without wrapping around. note that the actual sizes of the entries are not relevant, the requirements are: - entry before last wrap-around must be big enough that entry of current index can overtake it without another wrap-around - method that does iteration must be called before next wrap-around the fix is obviously trivial once the issue became apparent - when wrapping around during iteration, additionally check that we are not jumping across the red zone into already invalidated parts of data. clusterlog_merge is technically not affected since it aborts before a wrap-around anyway, but it doesn't hurt to have the checks consistently in case this ever changes. thanks to @kev1904 on our community forums for reporting and providing the data to nail the cause down fast! Signed-off-by: Fabian Grünbichler --- data/src/logger.c | 19 ++++++++++++++++++- 1 file changed, 18 insertions(+), 1 deletion(-) diff --git a/data/src/logger.c b/data/src/logger.c index 4cf9cce..619e1f6 100644 --- a/data/src/logger.c +++ b/data/src/logger.c @@ -136,6 +136,11 @@ clog_dump(clog_base_t *clog) while (cpos && (cpos <= clog->cpos || cpos > (clog->cpos + CLOG_MAX_ENTRY_SIZE))) { clog_entry_t *cur = (clog_entry_t *)((char *)clog + cpos); clog_dump_entry(cur, cpos); + + // wrap-around has to land after initial position + if (cpos < cur->prev && cur->prev <= clog->cpos) { + break; + } cpos = cur->prev; } } @@ -165,6 +170,11 @@ clog_dump_json( guint count = 0; while (cpos && (cpos <= clog->cpos || cpos > (clog->cpos + CLOG_MAX_ENTRY_SIZE))) { clog_entry_t *cur = (clog_entry_t *)((char *)clog + cpos); + + // wrap-around has to land after initial position + if (cpos < cur->prev && cur->prev <= clog->cpos) { + break; + } cpos = cur->prev; if (count >= max_entries) @@ -358,6 +368,11 @@ clog_sort(clog_base_t *clog) g_tree_insert(tree, cur, cur); + // wrap-around has to land after initial position + if (cpos < cur->prev && cur->prev <= clog->cpos) { + break; + } + cpos = cur->prev; } @@ -509,10 +524,12 @@ clusterlog_merge( break; } - if (!cur->prev) { + // no previous entry or wrap-around into already overwritten entry + if (!cur->prev || (cpos[found] < cur->prev && cur->prev <= clog[found]->cpos)) { cpos[found] = 0; } else { cpos[found] = cur->prev; + // wrap-around into current entry if (!(cpos[found] <= clog[found]->cpos || cpos[found] > (clog[found]->cpos + CLOG_MAX_ENTRY_SIZE))) { cpos[found] = 0; -- 2.30.2