From: Christian Ebner <c.ebner@proxmox.com>
To: pbs-devel@lists.proxmox.com
Date: Fri, 21 Feb 2025 15:01:05 +0100
Message-Id: <20250221140110.377328-1-c.ebner@proxmox.com>
X-Mailer: git-send-email 2.39.5
Subject: [pbs-devel] [PATCH proxmox-backup 0/5] GC: avoid multiple atime updates

These patches greatly improve the performance of phase 1 garbage
collection by avoiding multiple atime updates on the same chunk.

Currently, phase 1 GC iterates over all folders in the datastore,
collecting all image index files without making any assumptions about
the logical layout (e.g. namespaces, groups, snapshots, ...). This
avoids accidentally missing image index files located in unexpected
paths, which would leave their chunks unmarked as in use and could lead
to data loss.

These patches improve phase 1 by inserting the encountered image index
paths into a data structure that allows iterating over the index files
in a more logical order, following the same principle as incremental
backup snapshots. Index files with the same namespace, group and image
filename can therefore be inspected consecutively.
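To illustrate the grouping idea, here is a minimal sketch. It is not the
code from this series: the names `GroupKey`, `group_index_files` and
`split_index_path` are hypothetical, and it assumes index paths relative
to the datastore base that look like
`<namespace path>/<type>/<id>/<snapshot time>/<image name>`. Paths that
do not fit that layout are returned separately, so their chunks can
still be marked and nothing is skipped.

```rust
use std::collections::BTreeMap;
use std::path::{Path, PathBuf};

/// Hypothetical key: (namespace + group directory, index file name).
type GroupKey = (PathBuf, String);

/// Split a relative index path into its grouping key.
/// Returns `None` if the path does not match the assumed layout.
fn split_index_path(path: &Path) -> Option<GroupKey> {
    let file_name = path.file_name()?.to_str()?.to_owned();
    let snapshot_dir = path.parent()?;
    let group_dir = snapshot_dir.parent()?;
    // A group directory needs at least the `<type>/<id>` components.
    if group_dir.components().count() < 2 {
        return None;
    }
    Some((group_dir.to_path_buf(), file_name))
}

/// Group index files so that the same image of the same backup group is
/// visited consecutively. Paths in unexpected locations are collected
/// in the second return value instead of being dropped.
fn group_index_files(
    paths: &[PathBuf],
) -> (BTreeMap<GroupKey, Vec<PathBuf>>, Vec<PathBuf>) {
    let mut grouped: BTreeMap<GroupKey, Vec<PathBuf>> = BTreeMap::new();
    let mut unexpected = Vec::new();

    for path in paths {
        match split_index_path(path) {
            Some(key) => grouped.entry(key).or_default().push(path.clone()),
            None => unexpected.push(path.clone()),
        }
    }

    // Sort each list by path, which orders it by snapshot directory name.
    for list in grouped.values_mut() {
        list.sort();
    }

    (grouped, unexpected)
}
```

Iterating the map visits each image of a group across its snapshots one
after another, which is where consecutive snapshots are most likely to
reference the same chunks.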
Further, by keeping track of chunks whose atime has already been
updated, the atime is no longer updated over and over again on chunks
shared by consecutive backup snapshots.

To give some ballpark figures, this reduced phase 1 garbage collection
on a real world datastore containing some of my backups from around 2
minutes to about 16 seconds.

Christian Ebner (5):
  datastore: restrict datastores list_images method scope to module
  garbage collection: refactor archive type based chunk marking logic
  garbage collection: add structure for optimized image iteration
  garbage collection: allow to keep track of already touched chunks
  fix #5331: garbage collection: avoid multiple chunk atime updates

 pbs-datastore/src/datastore.rs | 204 ++++++++++++++++++++++++++-------
 1 file changed, 160 insertions(+), 44 deletions(-)

-- 
2.39.5
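As a rough illustration of the chunk tracking described above, the
following sketch remembers which chunk digests were already touched and
skips them on later index files. It is not the code from this series:
`Digest`, `mark_used_chunks` and the `touch` callback are hypothetical
stand-ins, with `touch` taking the place of the actual atime update on
the chunk file, and digests assumed to be 32 bytes long.

```rust
use std::collections::HashSet;

/// Assumed chunk digest type (32 bytes) for this sketch.
type Digest = [u8; 32];

/// Walk all index files and call `touch` at most once per distinct chunk.
/// Returns the number of atime updates that were actually issued.
fn mark_used_chunks<F>(indexes: &[Vec<Digest>], mut touch: F) -> usize
where
    F: FnMut(&Digest),
{
    let mut touched: HashSet<Digest> = HashSet::new();
    let mut updates = 0;

    for index in indexes {
        for digest in index {
            // `insert` returns false if the digest was already present,
            // so chunks shared between snapshots are only touched once.
            if touched.insert(*digest) {
                touch(digest);
                updates += 1;
            }
        }
    }

    updates
}
```

The set trades memory for fewer filesystem operations: each distinct
digest is kept in memory for the duration of the phase, while every
repeated reference to it costs a hash lookup instead of an atime update.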