From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from firstgate.proxmox.com (firstgate.proxmox.com [212.224.123.68]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits)) (No client certificate requested) by lists.proxmox.com (Postfix) with ESMTPS id 13AC2A9F2 for ; Fri, 8 Sep 2023 11:36:23 +0200 (CEST) Received: from firstgate.proxmox.com (localhost [127.0.0.1]) by firstgate.proxmox.com (Proxmox) with ESMTP id EA168FA20 for ; Fri, 8 Sep 2023 11:36:22 +0200 (CEST) Received: from proxmox-new.maurer-it.com (proxmox-new.maurer-it.com [94.136.29.106]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits)) (No client certificate requested) by firstgate.proxmox.com (Proxmox) with ESMTPS for ; Fri, 8 Sep 2023 11:36:20 +0200 (CEST) Received: from proxmox-new.maurer-it.com (localhost.localdomain [127.0.0.1]) by proxmox-new.maurer-it.com (Proxmox) with ESMTP id 542EE43523 for ; Fri, 8 Sep 2023 11:36:20 +0200 (CEST) Message-ID: <4111ea6e-b478-488e-b954-d6d9f3add1d4@proxmox.com> Date: Fri, 8 Sep 2023 11:36:17 +0200 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Content-Language: en-GB To: Gabriel Goller , Proxmox Backup Server development discussion References: <20230905093747.54879-1-g.goller@proxmox.com> <482af171-fcb0-4dc6-b662-66d1511fc4a3@proxmox.com> From: Thomas Lamprecht In-Reply-To: Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-SPAM-LEVEL: Spam detection results: 0 AWL -0.079 Adjusted score from AWL reputation of From: address BAYES_00 -1.9 Bayes spam probability is 0 to 1% DMARC_MISSING 0.1 Missing DMARC policy KAM_DMARC_STATUS 0.01 Test Rule for DKIM or SPF Failure with Strict Alignment SPF_HELO_NONE 0.001 SPF: HELO does not publish an SPF Record SPF_PASS -0.001 SPF: sender matches SPF record URIBL_BLOCKED 0.001 ADMINISTRATOR NOTICE: The query to URIBL was blocked. See http://wiki.apache.org/spamassassin/DnsBlocklists#dnsbl-block for more information. [velas.com] Subject: Re: [pbs-devel] [PATCH proxmox-backup] fix #4823: datastore: ignore vanished files when walking directory X-BeenThere: pbs-devel@lists.proxmox.com X-Mailman-Version: 2.1.29 Precedence: list List-Id: Proxmox Backup Server development discussion List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 08 Sep 2023 09:36:23 -0000 On 08/09/2023 09:41, Gabriel Goller wrote: >> that's some deep indentation level.. not a must, but maybe you find >> some good/simple way to refactor some of this to make it a bit less >> crowded here (if, then in a separate patch please) > I could check if `err` is an `io:Error`, thus returning early. Then calling `.unwrap()` > to get the actual `io::Error` later on. I re-checked the whole code and pushed a clean-up that reduces the indentation level and also code lines. Can you please re-check on that if to ensure I did not missed something by mistake? Oh, and while checking this out I noticed something odd, i.e., that we silence any error that isn't a io::Error one, as of Walkdir::Error rustdocs [0] that can only happen if there's a (symlink) loop, but still. So I checked the history and it seems that I caused this way back in 2020 by the original addition of the "ignore lost+found" EPERM commit [1]. But just making that an error again now might be a breaking change, depending on what effect symlink loops currently actually have on GC. Can you test this please? If it does makes GC endlessly loop then we can just introduce the error again, as it's now broken already anyway. But if walkdir breaks the loop, or the like, then we might just make that an log with warning level. As we do not enable WalkDir's opt-in follow_symlinks it might not matter after all as the Loop error shouldn't be triggered if symlinks aren't followed, but still good to test and be certain. [0]: https://rust.velas.com/walkdir/struct.Error.html#method.io_error [1]: c3b090ac ("backup: list images: handle walkdir error, catch "lost+found"") >>> +                            // only allow to skip ext4 fsck directory, avoid GC if, for example, >>> +                            // a user got file permissions wrong on datastore rsync to new server >>> +                            if err.depth() > 1 || !path.ends_with("lost+found") { >>> +                                bail!("cannot continue garbage-collection safely, permission denied on: {:?}", path) >>> +                            } >>> +                        } >>> +                        io::ErrorKind::NotFound => { >>> +                            // ignore vanished file >> would be still good to log that here, at least at debug level >> if it can be noisy; but as there wasn't many that run into this >> in the four years of PBS existing I'd guess a always visible >> level is fine as long as the log message doesn't sounds scary. > How about a "ignoring/skipping vanished file: {path}" on the info log level? Yeah, something like that, but I'd use just one of "ignoring" or "skipping" Oh, and I would also mention in the commit message that backups that are just being created (where a atomic rename from e.g. a tmpdir to the actual one might also cause a ENOENT) are not problematic here, as GC is handling those explicitly anyway.