From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from firstgate.proxmox.com (firstgate.proxmox.com [212.224.123.68]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits)) (No client certificate requested) by lists.proxmox.com (Postfix) with ESMTPS id 12352D83F for ; Mon, 21 Aug 2023 14:18:35 +0200 (CEST) Received: from firstgate.proxmox.com (localhost [127.0.0.1]) by firstgate.proxmox.com (Proxmox) with ESMTP id D653616DF2 for ; Mon, 21 Aug 2023 14:18:04 +0200 (CEST) Received: from proxmox-new.maurer-it.com (proxmox-new.maurer-it.com [94.136.29.106]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits)) (No client certificate requested) by firstgate.proxmox.com (Proxmox) with ESMTPS for ; Mon, 21 Aug 2023 14:18:04 +0200 (CEST) Received: from proxmox-new.maurer-it.com (localhost.localdomain [127.0.0.1]) by proxmox-new.maurer-it.com (Proxmox) with ESMTP id D47A142D36 for ; Mon, 21 Aug 2023 14:18:03 +0200 (CEST) Date: Mon, 21 Aug 2023 14:18:02 +0200 From: Wolfgang Bumiller To: Gabriel Goller Cc: pbs-devel@lists.proxmox.com Message-ID: References: <20230818143212.91113-1-g.goller@proxmox.com> <20230818143212.91113-2-g.goller@proxmox.com> MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: X-SPAM-LEVEL: Spam detection results: 0 AWL 0.104 Adjusted score from AWL reputation of From: address BAYES_00 -1.9 Bayes spam probability is 0 to 1% DMARC_MISSING 0.1 Missing DMARC policy KAM_DMARC_STATUS 0.01 Test Rule for DKIM or SPF Failure with Strict Alignment SPF_HELO_NONE 0.001 SPF: HELO does not publish an SPF Record SPF_PASS -0.001 SPF: sender matches SPF record Subject: Re: [pbs-devel] [PATCH proxmox-backup v5] fix #4380: check if file is excluded before running `stat()` X-BeenThere: pbs-devel@lists.proxmox.com X-Mailman-Version: 2.1.29 Precedence: list List-Id: Proxmox Backup Server development discussion List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 21 Aug 2023 12:18:35 -0000 On Mon, Aug 21, 2023 at 01:03:56PM +0200, Gabriel Goller wrote: > > On 8/21/23 10:41, Wolfgang Bumiller wrote: > > needs a rebase, and comments inline: > > > > On Fri, Aug 18, 2023 at 04:32:12PM +0200, Gabriel Goller wrote: > > > @@ -1108,7 +1108,7 @@ impl<'a> ExtractorState<'a> { > > > async fn handle_new_directory( > > > &mut self, > > > entry: catalog::DirEntry, > > > - match_result: Option, > > > + match_result: Result, FileModeRequired>, > > ^ This is not needed, let's not take errors with us throughout the > > process. > > > > This is our catalog shell. The catalog file always contains file types > > for its contents except for hardlinks, since we cannot resolve them. > > However, hardlinks also cannot recurse down any further, since they're > > also never directories, so `handle_new_directory` will never actually > > get `Err()` passed here. > > > > > ) -> Result<(), Error> { > > > // enter a new directory: > > > self.read_dir_stack.push(mem::replace( > > > @@ -1123,7 +1123,7 @@ impl<'a> ExtractorState<'a> { > > > Shell::walk_pxar_archive(self.accessor, &mut self.dir_stack).await?; > > > let dir_pxar = self.dir_stack.last().unwrap().pxar.as_ref().unwrap(); > > > let dir_meta = dir_pxar.entry().metadata().clone(); > > > - let create = self.matches && match_result != Some(MatchType::Exclude); > > > + let create = self.matches && match_result != Ok(Some(MatchType::Exclude)); > > ^ so no need for this change > > > > > self.extractor > > > .enter_directory(dir_pxar.file_name().to_os_string(), dir_meta, create)?; > > > @@ -1133,9 +1133,9 @@ impl<'a> ExtractorState<'a> { > > > pub async fn handle_entry(&mut self, entry: catalog::DirEntry) -> Result<(), Error> { > > > let match_result = self.match_list.matches(&self.path, entry.get_file_mode()); > > I'd like to point out that if we don't know what to do, we should fail, > > so, just use `?` here. > > > > Since this is an interactive tool, we *may* get away with just logging > > this and continuing with the matches you wrote out below here. > > > > But we should not just be silent about it :-) > > > > Personally, I'm fine with erroring and telling the user to investigate. > > In practice I doubt anyone will really run into this. The catalog shell > > is too awkward to have many users (I think???) > How about we introduce this log message, then use the question mark > operator? This would produce both a warning and an error message then, no? Or did you mean something else? > >     match (did_match, &entry.attr) { >         (_, DirEntryAttribute::Directory { .. }) => { >             if match_result.is_err() { >                 log::error!("error while matching file paths"); >             } >             self.handle_new_directory(entry, match_result?).await?; >         } >         ... >     } > > > let did_match = match match_result { > > > - Some(MatchType::Include) => true, > > > - Some(MatchType::Exclude) => false, > > > - None => self.matches, > > > + Ok(Some(MatchType::Include)) => true, > > > + Ok(Some(MatchType::Exclude)) => false, > > > + _ => self.matches, > > > }; > > > match (did_match, &entry.attr) { > > > diff --git a/pbs-client/src/pxar/create.rs b/pbs-client/src/pxar/create.rs > > > index 2577cf98..4a844599 100644 > > > --- a/pbs-client/src/pxar/create.rs > > > +++ b/pbs-client/src/pxar/create.rs (...) > > > + if stat_results.is_none() { > > > + match nix::sys::stat::fstatat( > > > + dir_fd, > > > + file_name.to_owned().as_c_str(), > > > + nix::fcntl::AtFlags::AT_SYMLINK_NOFOLLOW, > > > + ) { > > > + Ok(stat) => { > > > + stat_results = Some(stat); > > > + } > > > + Err(err) => { > > > + return Err(err).context(format!( > > > + "stat failed on {:?} with {:?}", > > > + full_path, > > > + err.desc() > > > + )); > > > + } > > > + } > > > + } > > ^ Instead of this large block we then only need: > > > > let stat = stat_results.map(Ok).unwrap_or_else(get_file_mode)?; > > > > > + > > > self.entry_counter += 1; > > > if self.entry_counter > self.entry_limit { > > > bail!( > > > @@ -462,9 +493,9 @@ impl Archiver { > > > } > > > file_list.push(FileListEntry { > > > - name: file_name, > > > + name: file_name.to_owned(), > > > path: full_path, > > > - stat, > > > + stat: stat_results.unwrap(), > > ^ and this change can just be dropped :-) > > > > > }); > > > } > > > @@ -534,7 +565,7 @@ impl Archiver { > > > if self > > > .patterns > > > .matches(match_path.as_os_str().as_bytes(), Some(stat.st_mode)) > > > - == Some(MatchType::Exclude) > > > + == Ok(Some(MatchType::Exclude)) > > ^ Please don't just ignore errors, just put a `?` onto the call. > > > > On the one hand we know that this cannot error since we pass `Some`, so > > we could even unwrap, but that's worse, since there's no guarantee that > > no other errors can happen. > > We *may* be able to guarantee this by adding a `GetFileMode` impl > > in pathpatterns directly to `u32` with `type Error = > > std::convert::Infallible`, and then drop the `Some()` wrapping around > > this piece? > > Yes, that's nice. Adding this implementation to pathpatterns: > >     impl GetFileMode for u32 { >         type Error = std::convert::Infallible; >         fn get(self) -> Result { >             Ok(self) >         } >     } > > then it should look like this: > >     if self >         .patterns >         .matches(match_path.as_os_str().as_bytes(), stat.st_mode)? >             == Some(MatchType::Exclude) >     { >         return Ok(()); >     } > > > > { > > > return Ok(()); > > > } > > > diff --git a/pbs-client/src/pxar/extract.rs b/pbs-client/src/pxar/extract.rs > > > index 4dbaf52d..e08af3c0 100644 > > > --- a/pbs-client/src/pxar/extract.rs > > > +++ b/pbs-client/src/pxar/extract.rs > > > @@ -244,16 +244,17 @@ where > > > ); > > > let did_match = match match_result { > > > - Some(MatchType::Include) => true, > > > - Some(MatchType::Exclude) => false, > > > - None => self.state.current_match, > > > + Ok(Some(MatchType::Include)) => true, > > > + Ok(Some(MatchType::Exclude)) => false, > > > + _ => self.state.current_match, > > Same here, 3 lines above we pass `Some(file_type)` to `matches()`. > > We should not just use `_` here, but rather, the above should fail. > > Here we can write this: > >     let match_result = self.match_list.matches( >         entry.path().as_os_str().as_bytes(), >         metadata.file_type() as u32 >     ).unwrap(); >     let did_match = match match_result { >         Some(MatchType::Include) => true, >         Some(MatchType::Exclude) => false, >         _ => self.state.current_match, > >     }; > > Unwrapping should (?) be safe because the `Err` is `Infallible`. Yeah I guess that's fine. Note: please add comments above unwraps ;-) Somehow I'm thinking the stdlib should have a separate function for this... like fn Result::unwrap_infallible(self) -> T... But I don't really feel like it's worth it pulling in another crate for this ;-)