Date: Tue, 26 Sep 2023 09:15:50 +0200 (CEST)
From: Christian Ebner
To: pbs-devel@lists.proxmox.com
Message-ID: <1301290754.4714.1695712550183@webmail.proxmox.com>
In-Reply-To: <20230922071621.12670-1-c.ebner@proxmox.com>
References: <20230922071621.12670-1-c.ebner@proxmox.com>
Subject: Re: [pbs-devel] [RFC pxar proxmox-backup 00/20] fix
 #3174: improve file-level backup
List-Id: Proxmox Backup Server development discussion

Thomas suggested including some form of benchmark, which would be useful
not only for measuring performance but also as a regression test in a CI
pipeline and/or for optimizing tunable parameters.

> On 22.09.2023 09:16 CEST Christian Ebner wrote:
>
> This (still rather rough) series of patches prototypes a possible
> approach to improve the pxar file-level backup creation speed.
> The series is intended to get first feedback on the implementation
> approach and to find possible pitfalls I might not be aware of.
>
> The current approach is to skip encoding of regular file payloads
> for which metadata (currently mtime and size) did not change as
> compared to a previous backup run. Instead of re-encoding the files, a
> reference to a newly introduced appendix section of the pxar archive
> will be written. The appendix section will be created as a concatenation
> of indexed chunks from the previous backup run, thereby containing the
> sequential file payload at a calculated offset with respect to the
> starting point of the appendix section.
>
> Metadata comparison and calculation of the chunks to be indexed for the
> appendix section are performed using the catalog of a previous backup as
> reference. In order to be able to calculate the offsets, the current
> catalog format is extended to include the file offset with respect to
> the pxar archive byte stream. This makes it possible to find the required
> chunk indexes, the start padding within the concatenated chunks and the
> total bytes introduced by the chunks.
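
For illustration only, the offset calculation described in the paragraph
above could look roughly like the following Python sketch. It assumes that
the chunk index of the previous backup is available as a sorted list of
cumulative chunk end offsets and that the catalog provides the file's
offset and size within the pxar archive byte stream. The function and
parameter names are hypothetical and not taken from the patch series.

```python
from bisect import bisect_right


def chunks_for_range(chunk_ends, offset, size):
    """Find the chunks covering the byte range [offset, offset + size).

    chunk_ends[i] is the exclusive end offset of chunk i in the archive
    byte stream; chunk 0 starts at offset 0 (an assumption of this sketch).
    Returns (first_chunk, last_chunk, start_padding, total_bytes).
    """
    if not chunk_ends or size <= 0 or offset < 0:
        raise ValueError("invalid index or range")
    if offset + size > chunk_ends[-1]:
        raise ValueError("range exceeds the indexed byte stream")

    # First chunk whose end lies beyond the first byte of the file.
    first = bisect_right(chunk_ends, offset)
    # Chunk containing the last byte of the file.
    last = bisect_right(chunk_ends, offset + size - 1)

    chunk_start = chunk_ends[first - 1] if first > 0 else 0
    # Bytes between the start of the first chunk and the file payload.
    start_padding = offset - chunk_start
    # Bytes the referenced chunks add to the appendix section.
    total_bytes = chunk_ends[last] - chunk_start
    return first, last, start_padding, total_bytes


# A 200 byte file at archive offset 120, with chunks ending at the
# given offsets, spans chunks 1 and 2.
print(chunks_for_range([100, 250, 400, 1000], 120, 200))  # (1, 2, 20, 300)
```

The binary search keeps the lookup logarithmic in the number of chunks,
which matters when many unchanged files have to be resolved.
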
>
> During encoding, the chunks needed for the appendix section are injected
> in the pxar archive after forcing a chunk boundary when regular pxar
> encoding is finished. Finally, pxar archives containing an appendix
> section are marked as such by appending a final pxar goodbye lookup
> table containing only the offset to the appendix section start and the
> total size of that section, needed for random access, e.g. for mounting
> the archive via the FUSE filesystem implementation.
>
> Currently, the code assumes the reference backup (for which the previous
> run is used) to be a regular backup without appendix section, and the
> catalog for that backup to already contain the required additional
> offset information.
>
> An invocation therefore looks like:
> ```bash
> proxmox-backup-client backup