Message-ID: <51776875-75fe-74ae-4f42-3ed18fd0f144@proxmox.com>
Date: Mon, 9 May 2022 13:51:07 +0200
From: Dominik Csapak
To: Thomas Lamprecht, Proxmox Backup Server development discussion
References: <20220509104030.1943794-1-d.csapak@proxmox.com> <32864ef5-48b2-38be-80ee-4ba43adbd0a9@proxmox.com>
In-Reply-To: <32864ef5-48b2-38be-80ee-4ba43adbd0a9@proxmox.com>
Subject: Re: [pbs-devel] [PATCH proxmox-backup] chunk_store: insert_chunk: write chunk again if sizes don't match
List-Id: Proxmox Backup Server development discussion

On 5/9/22 13:34, Thomas Lamprecht wrote:
> On 09/05/2022 12:40, Dominik Csapak wrote:
>> if the on-disk size of a chunk is not correct, write it again when
>> inserting and log a warning.
>>
>> This is currently possible if PBS crashes, but the rename of the chunk
>> was flushed to disk, when the actual data was not.
>
> could be also interesting to note here that we basically got all data
> required to do that already anyway.

what exactly do you mean here? just adding
'since we already have the complete chunk data, we are able to overwrite it'
(or similar?)

>
> And I'd think that a verify would catch this too and rename it to .bad, albeit
> that can naturally be to late if one is unlucky, so still good to do.

yes a verify will trigger that, but as you said, that can be too late ;)

also any input on Fabian's suggestion to bail out when
old_size != 0 but old_size != new_size?
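
to make that question concrete, here is a minimal sketch of the three outcomes
under that suggestion. The enum and function names are hypothetical (nothing
like this exists in chunk_store.rs), and treating a zero-length file as the
"rename flushed, data not" case is my assumption, not something the patch states.

```rust
/// Hypothetical classification of an already existing chunk file.
#[derive(Debug, PartialEq, Eq)]
enum ExistingChunk {
    /// Sizes match: touch the chunk and report it as a duplicate.
    Reuse,
    /// On-disk file is empty (assumed: data was never flushed): write it again.
    Rewrite,
    /// Non-empty file with a different size: refuse and return an error.
    Bail,
}

/// Decide what to do, given the on-disk size and the size of the new encoded chunk.
fn classify_existing(old_size: u64, new_size: u64) -> ExistingChunk {
    if old_size == new_size {
        ExistingChunk::Reuse
    } else if old_size == 0 {
        ExistingChunk::Rewrite
    } else {
        ExistingChunk::Bail
    }
}
```

the patch as sent would map both non-matching cases to "write again"; the
variant above only differs in the last branch.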
>
> small nit inline (can be probably just fixed up on apply)
>
>>
>> Suggested-by: Fabian Grünbichler
>> Signed-off-by: Dominik Csapak
>> ---
>>  pbs-datastore/src/chunk_store.rs | 23 ++++++++++++++++-------
>>  1 file changed, 16 insertions(+), 7 deletions(-)
>>
>> diff --git a/pbs-datastore/src/chunk_store.rs b/pbs-datastore/src/chunk_store.rs
>> index 8d7df513..93f56e8b 100644
>> --- a/pbs-datastore/src/chunk_store.rs
>> +++ b/pbs-datastore/src/chunk_store.rs
>> @@ -458,17 +458,29 @@ impl ChunkStore {
>>
>>          let lock = self.mutex.lock();
>>
>> +        let raw_data = chunk.raw_data();
>> +        let encoded_size = raw_data.len() as u64;
>> +
>>          if let Ok(metadata) = std::fs::metadata(&chunk_path) {
>> -            if metadata.is_file() {
>> -                self.touch_chunk(digest)?;
>> -                return Ok((true, metadata.len()));
>> -            } else {
>> +            if !metadata.is_file() {
>>                  bail!(
>>                      "Got unexpected file type on store '{}' for chunk {}",
>>                      self.name,
>>                      digest_str
>>                  );
>>              }
>> +            let new_len = metadata.len();
>> +            if encoded_size == new_len {
>> +                self.touch_chunk(digest)?;
>> +                return Ok((true, new_len));
>> +            } else {
>> +                log::warn!(
>> +                    "chunk size mismatch on insert for {}: old {} - new {}",
>
> fyi: you can now use variable names directly in format strings:
>
> "chunk size mismatch on insert for {digest_str}: old {encoded_size} - new {new_len}",
>
> nit: why is one named a "size" and one a "len", if they're the same thing it'd be
> nice to have it consistent.
>

argh... 'new_len' is actually the old one. so it'll be 'encoded_size' and 'old_size'..
thanks for making me look again ^^

>> +                    digest_str,
>> +                    encoded_size,
>> +                    new_len
>> +                );
>> +            }
>>          }
>>
>>          let mut tmp_path = chunk_path.clone();
>> @@ -483,9 +495,6 @@ impl ChunkStore {
>>              )
>>          })?;
>>
>> -        let raw_data = chunk.raw_data();
>> -        let encoded_size = raw_data.len() as u64;
>> -
>>          file.write_all(raw_data).map_err(|err| {
>>              format_err!(
>>                  "writing temporary chunk on store '{}' failed for {} - {}",
>
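
for reference, a standalone sketch of the flow with the renamed variable and
the inline format arguments. It only uses std and is not the actual
ChunkStore code: `insert_chunk_sketch` and the ".tmp" naming are made up for
illustration, locking and `touch_chunk` are left out, and the warning goes to
stderr instead of `log::warn!`.

```rust
use std::fs;
use std::io::{self, Write};
use std::path::Path;

/// Hypothetical stand-in for `ChunkStore::insert_chunk`.
/// Returns `(already_existed, size)`, like the tuple in the quoted patch.
fn insert_chunk_sketch(chunk_path: &Path, raw_data: &[u8]) -> io::Result<(bool, u64)> {
    let encoded_size = raw_data.len() as u64;

    if let Ok(metadata) = fs::metadata(chunk_path) {
        if !metadata.is_file() {
            return Err(io::Error::new(
                io::ErrorKind::Other,
                format!("unexpected file type for chunk {}", chunk_path.display()),
            ));
        }
        // The metadata describes the file already on disk, hence "old".
        let old_size = metadata.len();
        if encoded_size == old_size {
            // The real code would also touch the chunk here.
            return Ok((true, old_size));
        }
        // Inline format arguments (Rust 1.58+) accept plain identifiers only,
        // so the displayable path is bound to a local first.
        let name = chunk_path.display();
        eprintln!("chunk size mismatch on insert for {name}: old {old_size} - new {encoded_size}");
    }

    // Write to a temporary file, then rename it over the target path.
    let mut tmp_path = chunk_path.to_path_buf();
    tmp_path.set_extension("tmp");
    let mut file = fs::File::create(&tmp_path)?;
    file.write_all(raw_data)?;
    fs::rename(&tmp_path, chunk_path)?;

    Ok((false, encoded_size))
}
```

a first call for a new path returns `(false, len)`, a second call with the
same data returns `(true, len)`, and a call after the file was truncated
logs the mismatch and writes the data again.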