From: Christian Ebner
To: pbs-devel@lists.proxmox.com
Date: Wed, 28 Feb 2024 15:01:50 +0100
Message-Id: <20240228140226.1251979-1-c.ebner@proxmox.com>
Subject: [pbs-devel] [RFC pxar proxmox-backup 00/36] fix #3174: improve file-level backup
List-Id: Proxmox Backup Server development discussion

Disclaimer: These patches are work in progress and not yet intended for
production use. Their purpose is initial testing and review.

This patch series implements a metadata-based file change detection
mechanism that speeds up pxar file-level backup creation for unchanged
files.

The chosen approach is to split pxar archives, on creation via the
proxmox-backup-client, into two separate archives and upload streams:
one exclusively for regular file payloads, the other for the rest of the
pxar archive, which is mostly metadata.

On consecutive runs, the metadata archive of the previous backup run,
which is limited in size and can therefore be accessed rapidly, is used
to look up and compare the metadata of the entries to encode. This
assumes that the connection to the Proxmox Backup Server is sufficiently
fast to allow downloading and caching the chunks for that index.

Changes to regular files are detected by comparing the file's entire
metadata object, including mtime, ACLs, etc. If no changes are detected,
the previous payload index is used to look up chunks that can possibly be
re-used in the payload stream of the new archive. In order to reduce
possible chunk fragmentation, the decision whether to re-use or re-encode
a file payload is deferred until enough information has been gathered by
adding entries to a look-ahead cache.
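
To illustrate the metadata comparison described above, here is a minimal
sketch in Rust. It is not taken from the patches: the `FileMetadata`
struct, its fields, the `Decision` enum and the `classify` function are
all hypothetical stand-ins, and the real pxar metadata object carries
more information than shown here.

```rust
/// Hypothetical, simplified stand-in for the per-entry metadata object.
/// The real pxar metadata type is assumed to hold more fields than this.
#[derive(Debug, Clone, PartialEq, Eq)]
struct FileMetadata {
    mode: u32,
    uid: u32,
    gid: u32,
    size: u64,
    mtime_secs: i64,
    mtime_nanos: u32,
    acls: Vec<String>,
}

/// Outcome of comparing an entry against the previous backup run.
#[derive(Debug, PartialEq, Eq)]
enum Decision {
    /// Metadata is identical, so the previous payload chunks are
    /// candidates for re-use.
    ReuseCandidate,
    /// Entry is new or its metadata changed, so it must be re-encoded.
    Reencode,
}

/// Compare the whole metadata object, not just mtime: any difference
/// (or a missing previous entry) forces a regular re-encode.
fn classify(previous: Option<&FileMetadata>, current: &FileMetadata) -> Decision {
    match previous {
        Some(prev) if prev == current => Decision::ReuseCandidate,
        _ => Decision::Reencode,
    }
}
```

In this sketch an unchanged entry is only a candidate for re-use; whether
its chunks are actually re-used is a separate, later decision.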
If enough payload is referenced, the chunks are re-used and injected into
the pxar payload upload stream; otherwise they are discarded and the files
are encoded regularly.

An invocation of a backup run with these patches now is:

```bash
proxmox-backup-client backup