From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from firstgate.proxmox.com (firstgate.proxmox.com [212.224.123.68]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits)) (No client certificate requested) by lists.proxmox.com (Postfix) with ESMTPS id A746EAE2D for ; Fri, 8 Sep 2023 14:56:19 +0200 (CEST) Received: from firstgate.proxmox.com (localhost [127.0.0.1]) by firstgate.proxmox.com (Proxmox) with ESMTP id 90A7012159 for ; Fri, 8 Sep 2023 14:56:19 +0200 (CEST) Received: from proxmox-new.maurer-it.com (proxmox-new.maurer-it.com [94.136.29.106]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits)) (No client certificate requested) by firstgate.proxmox.com (Proxmox) with ESMTPS for ; Fri, 8 Sep 2023 14:56:18 +0200 (CEST) Received: from proxmox-new.maurer-it.com (localhost.localdomain [127.0.0.1]) by proxmox-new.maurer-it.com (Proxmox) with ESMTP id 4A7E843276 for ; Fri, 8 Sep 2023 14:56:18 +0200 (CEST) Message-ID: <318f786b-a715-d99b-7819-1b25899bb6ac@proxmox.com> Date: Fri, 8 Sep 2023 14:56:17 +0200 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.15.0 Content-Language: en-US To: Thomas Lamprecht , Proxmox Backup Server development discussion References: <20230905093747.54879-1-g.goller@proxmox.com> <482af171-fcb0-4dc6-b662-66d1511fc4a3@proxmox.com> <4111ea6e-b478-488e-b954-d6d9f3add1d4@proxmox.com> From: Gabriel Goller In-Reply-To: <4111ea6e-b478-488e-b954-d6d9f3add1d4@proxmox.com> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit X-SPAM-LEVEL: Spam detection results: 0 AWL 0.319 Adjusted score from AWL reputation of From: address BAYES_00 -1.9 Bayes spam probability is 0 to 1% DMARC_MISSING 0.1 Missing DMARC policy KAM_DMARC_STATUS 0.01 Test Rule for DKIM or SPF Failure with Strict Alignment NICE_REPLY_A -1.473 Looks like a legit reply (A) SPF_HELO_NONE 0.001 SPF: HELO does not publish an SPF Record SPF_PASS -0.001 SPF: sender matches SPF record Subject: Re: [pbs-devel] [PATCH proxmox-backup] fix #4823: datastore: ignore vanished files when walking directory X-BeenThere: pbs-devel@lists.proxmox.com X-Mailman-Version: 2.1.29 Precedence: list List-Id: Proxmox Backup Server development discussion List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 08 Sep 2023 12:56:19 -0000 On 9/8/23 11:36, Thomas Lamprecht wrote: > On 08/09/2023 09:41, Gabriel Goller wrote: >>> that's some deep indentation level.. not a must, but maybe you find >>> some good/simple way to refactor some of this to make it a bit less >>> crowded here (if, then in a separate patch please) >> I could check if `err` is an `io:Error`, thus returning early. Then calling `.unwrap()` >> to get the actual `io::Error` later on. > I re-checked the whole code and pushed a clean-up that reduces the > indentation level and also code lines. Can you please re-check on that > if to ensure I did not missed something by mistake? > > Oh, and while checking this out I noticed something odd, i.e., that we > silence any error that isn't a io::Error one, as of Walkdir::Error > rustdocs [0] that can only happen if there's a (symlink) loop, but > still. So I checked the history and it seems that I caused this way > back in 2020 by the original addition of the "ignore lost+found" EPERM > commit [1]. > But just making that an error again now might be a breaking change, > depending on what effect symlink loops currently actually have on GC. > Can you test this please? If it does makes GC endlessly loop then we > can just introduce the error again, as it's now broken already anyway. > But if walkdir breaks the loop, or the like, then we might just make > that an log with warning level. As we do not enable WalkDir's opt-in > follow_symlinks it might not matter after all as the Loop error > shouldn't be triggered if symlinks aren't followed, but still good > to test and be certain. > > [0]: https://rust.velas.com/walkdir/struct.Error.html#method.io_error > [1]: c3b090ac ("backup: list images: handle walkdir error, catch "lost+found"") We don't follow symbolic links when traversing files. `walkdir` has a builder function `follow_links()` with which you can select if you want to follow symbolic links or not, the default is no [0]. [0]: https://docs.rs/walkdir/latest/walkdir/struct.WalkDir.html#method.follow_links >>>> +                            // only allow to skip ext4 fsck directory, avoid GC if, for example, >>>> +                            // a user got file permissions wrong on datastore rsync to new server >>>> +                            if err.depth() > 1 || !path.ends_with("lost+found") { >>>> +                                bail!("cannot continue garbage-collection safely, permission denied on: {:?}", path) >>>> +                            } >>>> +                        } >>>> +                        io::ErrorKind::NotFound => { >>>> +                            // ignore vanished file >>> would be still good to log that here, at least at debug level >>> if it can be noisy; but as there wasn't many that run into this >>> in the four years of PBS existing I'd guess a always visible >>> level is fine as long as the log message doesn't sounds scary. >> How about a "ignoring/skipping vanished file: {path}" on the info log level? > Yeah, something like that, but I'd use just one of "ignoring" or > "skipping" > > Oh, and I would also mention in the commit message that backups that > are just being created (where a atomic rename from e.g. a tmpdir to > the actual one might also cause a ENOENT) are not problematic here, > as GC is handling those explicitly anyway. Submitted a new patch.