From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from firstgate.proxmox.com (firstgate.proxmox.com [212.224.123.68]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by lists.proxmox.com (Postfix) with ESMTPS id 4F80B8580 for ; Mon, 21 Aug 2023 14:49:32 +0200 (CEST) Received: from firstgate.proxmox.com (localhost [127.0.0.1]) by firstgate.proxmox.com (Proxmox) with ESMTP id 2EA04C6 for ; Mon, 21 Aug 2023 14:49:02 +0200 (CEST) Received: from proxmox-new.maurer-it.com (proxmox-new.maurer-it.com [94.136.29.106]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by firstgate.proxmox.com (Proxmox) with ESMTPS for ; Mon, 21 Aug 2023 14:49:01 +0200 (CEST) Received: from proxmox-new.maurer-it.com (localhost.localdomain [127.0.0.1]) by proxmox-new.maurer-it.com (Proxmox) with ESMTP id 2D7A642D34 for ; Mon, 21 Aug 2023 14:49:01 +0200 (CEST) Message-ID: Date: Mon, 21 Aug 2023 14:48:59 +0200 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.14.0 Content-Language: en-US To: Wolfgang Bumiller Cc: pbs-devel@lists.proxmox.com References: <20230818143212.91113-1-g.goller@proxmox.com> <20230818143212.91113-2-g.goller@proxmox.com> From: Gabriel Goller In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit X-SPAM-LEVEL: Spam detection results: 0 AWL 1.630 Adjusted score from AWL reputation of From: address BAYES_00 -1.9 Bayes spam probability is 0 to 1% DMARC_MISSING 0.1 Missing DMARC policy KAM_DMARC_STATUS 0.01 Test Rule for DKIM or SPF Failure with Strict Alignment NICE_REPLY_A -4.279 Looks like a legit reply (A) SPF_HELO_NONE 0.001 SPF: HELO does not publish an SPF Record SPF_PASS -0.001 SPF: sender matches SPF record Subject: Re: [pbs-devel] [PATCH proxmox-backup v5] fix #4380: check if file is excluded before running `stat()` X-BeenThere: pbs-devel@lists.proxmox.com X-Mailman-Version: 2.1.29 Precedence: list List-Id: Proxmox Backup Server development discussion List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 21 Aug 2023 12:49:32 -0000 On 8/21/23 14:18, Wolfgang Bumiller wrote: > On Mon, Aug 21, 2023 at 01:03:56PM +0200, Gabriel Goller wrote: >> On 8/21/23 10:41, Wolfgang Bumiller wrote: >>> needs a rebase, and comments inline: >>> >>> On Fri, Aug 18, 2023 at 04:32:12PM +0200, Gabriel Goller wrote: >>>> @@ -1108,7 +1108,7 @@ impl<'a> ExtractorState<'a> { >>>> async fn handle_new_directory( >>>> &mut self, >>>> entry: catalog::DirEntry, >>>> - match_result: Option, >>>> + match_result: Result, FileModeRequired>, >>> ^ This is not needed, let's not take errors with us throughout the >>> process. >>> >>> This is our catalog shell. The catalog file always contains file types >>> for its contents except for hardlinks, since we cannot resolve them. >>> However, hardlinks also cannot recurse down any further, since they're >>> also never directories, so `handle_new_directory` will never actually >>> get `Err()` passed here. >>> >>>> ) -> Result<(), Error> { >>>> // enter a new directory: >>>> self.read_dir_stack.push(mem::replace( >>>> @@ -1123,7 +1123,7 @@ impl<'a> ExtractorState<'a> { >>>> Shell::walk_pxar_archive(self.accessor, &mut self.dir_stack).await?; >>>> let dir_pxar = self.dir_stack.last().unwrap().pxar.as_ref().unwrap(); >>>> let dir_meta = dir_pxar.entry().metadata().clone(); >>>> - let create = self.matches && match_result != Some(MatchType::Exclude); >>>> + let create = self.matches && match_result != Ok(Some(MatchType::Exclude)); >>> ^ so no need for this change >>> >>>> self.extractor >>>> .enter_directory(dir_pxar.file_name().to_os_string(), dir_meta, create)?; >>>> @@ -1133,9 +1133,9 @@ impl<'a> ExtractorState<'a> { >>>> pub async fn handle_entry(&mut self, entry: catalog::DirEntry) -> Result<(), Error> { >>>> let match_result = self.match_list.matches(&self.path, entry.get_file_mode()); >>> I'd like to point out that if we don't know what to do, we should fail, >>> so, just use `?` here. >>> >>> Since this is an interactive tool, we *may* get away with just logging >>> this and continuing with the matches you wrote out below here. >>> >>> But we should not just be silent about it :-) >>> >>> Personally, I'm fine with erroring and telling the user to investigate. >>> In practice I doubt anyone will really run into this. The catalog shell >>> is too awkward to have many users (I think???) >> How about we introduce this log message, then use the question mark >> operator? > This would produce both a warning and an error message then, no? > Or did you mean something else? Oh, you are right, I missed that. We already log the error. >>     match (did_match, &entry.attr) { >>         (_, DirEntryAttribute::Directory { .. }) => { >>             if match_result.is_err() { >>                 log::error!("error while matching file paths"); >>             } >>             self.handle_new_directory(entry, match_result?).await?; >>         } >>         ... >>     } >>>> let did_match = match match_result { >>>> - Some(MatchType::Include) => true, >>>> - Some(MatchType::Exclude) => false, >>>> - None => self.matches, >>>> + Ok(Some(MatchType::Include)) => true, >>>> + Ok(Some(MatchType::Exclude)) => false, >>>> + _ => self.matches, >>>> }; >>>> match (did_match, &entry.attr) { >>>> diff --git a/pbs-client/src/pxar/create.rs b/pbs-client/src/pxar/create.rs >>>> index 2577cf98..4a844599 100644 >>>> --- a/pbs-client/src/pxar/create.rs >>>> +++ b/pbs-client/src/pxar/create.rs > (...) >>>> + if stat_results.is_none() { >>>> + match nix::sys::stat::fstatat( >>>> + dir_fd, >>>> + file_name.to_owned().as_c_str(), >>>> + nix::fcntl::AtFlags::AT_SYMLINK_NOFOLLOW, >>>> + ) { >>>> + Ok(stat) => { >>>> + stat_results = Some(stat); >>>> + } >>>> + Err(err) => { >>>> + return Err(err).context(format!( >>>> + "stat failed on {:?} with {:?}", >>>> + full_path, >>>> + err.desc() >>>> + )); >>>> + } >>>> + } >>>> + } >>> ^ Instead of this large block we then only need: >>> >>> let stat = stat_results.map(Ok).unwrap_or_else(get_file_mode)?; >>> >>>> + >>>> self.entry_counter += 1; >>>> if self.entry_counter > self.entry_limit { >>>> bail!( >>>> @@ -462,9 +493,9 @@ impl Archiver { >>>> } >>>> file_list.push(FileListEntry { >>>> - name: file_name, >>>> + name: file_name.to_owned(), >>>> path: full_path, >>>> - stat, >>>> + stat: stat_results.unwrap(), >>> ^ and this change can just be dropped :-) >>> >>>> }); >>>> } >>>> @@ -534,7 +565,7 @@ impl Archiver { >>>> if self >>>> .patterns >>>> .matches(match_path.as_os_str().as_bytes(), Some(stat.st_mode)) >>>> - == Some(MatchType::Exclude) >>>> + == Ok(Some(MatchType::Exclude)) >>> ^ Please don't just ignore errors, just put a `?` onto the call. >>> >>> On the one hand we know that this cannot error since we pass `Some`, so >>> we could even unwrap, but that's worse, since there's no guarantee that >>> no other errors can happen. >>> We *may* be able to guarantee this by adding a `GetFileMode` impl >>> in pathpatterns directly to `u32` with `type Error = >>> std::convert::Infallible`, and then drop the `Some()` wrapping around >>> this piece? >> Yes, that's nice. Adding this implementation to pathpatterns: >> >>     impl GetFileMode for u32 { >>         type Error = std::convert::Infallible; >>         fn get(self) -> Result { >>             Ok(self) >>         } >>     } >> >> then it should look like this: >> >>     if self >>         .patterns >>         .matches(match_path.as_os_str().as_bytes(), stat.st_mode)? >>             == Some(MatchType::Exclude) >>     { >>         return Ok(()); >>     } >> >>>> { >>>> return Ok(()); >>>> } >>>> diff --git a/pbs-client/src/pxar/extract.rs b/pbs-client/src/pxar/extract.rs >>>> index 4dbaf52d..e08af3c0 100644 >>>> --- a/pbs-client/src/pxar/extract.rs >>>> +++ b/pbs-client/src/pxar/extract.rs >>>> @@ -244,16 +244,17 @@ where >>>> ); >>>> let did_match = match match_result { >>>> - Some(MatchType::Include) => true, >>>> - Some(MatchType::Exclude) => false, >>>> - None => self.state.current_match, >>>> + Ok(Some(MatchType::Include)) => true, >>>> + Ok(Some(MatchType::Exclude)) => false, >>>> + _ => self.state.current_match, >>> Same here, 3 lines above we pass `Some(file_type)` to `matches()`. >>> We should not just use `_` here, but rather, the above should fail. >> Here we can write this: >> >>     let match_result = self.match_list.matches( >>         entry.path().as_os_str().as_bytes(), >>         metadata.file_type() as u32 >>     ).unwrap(); >>     let did_match = match match_result { >>         Some(MatchType::Include) => true, >>         Some(MatchType::Exclude) => false, >>         _ => self.state.current_match, >> >>     }; >> >> Unwrapping should (?) be safe because the `Err` is `Infallible`. > Yeah I guess that's fine. Note: please add comments above unwraps ;-) > > Somehow I'm thinking the stdlib should have a separate function for > this... like fn Result::unwrap_infallible(self) -> T... > But I don't really feel like it's worth it pulling in another crate for > this ;-) Yes, I agree.