From: Christian Ebner
To: pbs-devel@lists.proxmox.com
Date: Mon, 29 Apr 2024 14:13:07 +0200
Subject: Re: [pbs-devel] [PATCH v3 pxar proxmox-backup 00/58] fix #3174: improve file-level backup
Message-ID: <7ebd7071-7a53-4b83-8333-f05d61c9f868@proxmox.com>
In-Reply-To: <20240328123707.336951-1-c.ebner@proxmox.com>

On 3/28/24 13:36, Christian Ebner wrote:
> A big thank you to Dietmar and Fabian for the review of the previous
> version and Fabian for extensive testing and help during debugging.
>
> This series of patches implements a metadata-based file change
> detection mechanism for improved pxar file-level backup creation speed
> for unchanged files.
>
> The chosen approach is to split pxar archives on creation via the
> proxmox-backup-client into two separate data and upload streams,
> one exclusive for regular file payloads, the other one for the rest
> of the pxar archive, which is mostly metadata.
>
> On consecutive runs, the metadata archive of the previous backup run,
> which is limited in size and therefore rapidly accessed, is used to
> look up and compare the metadata for entries to encode.
> This assumes that the connection speed to the Proxmox Backup Server is
> sufficiently fast, allowing the download and caching of the chunks for
> that index.
>
> Changes to regular files are detected by comparing the file's entire
> metadata object, including mtime, ACLs, etc. If no changes are detected,
> the previous payload index is used to look up chunks to possibly re-use
> in the payload stream of the new archive.
> In order to reduce possible chunk fragmentation, the decision whether to
> re-use or re-encode a file payload is deferred until enough information
> is gathered by adding entries to a look-ahead cache. If the padding
> introduced by reusing chunks falls below a threshold, the entries are
> referenced, the chunks are re-used and injected into the pxar payload
> upload stream; otherwise they are discarded and the files are encoded
> regularly.
>
> The following lists the most notable changes included in this series
> since version 2:
> - many bugfixes regarding incorrect archive encoding caused by wrong
>   offset generation, adding additional sanity checks and failing on
>   encoding rather than producing an incorrectly encoded archive
> - different approach for deciding whether to re-use or re-encode the
>   entries. Previously, the entries were encoded when a cached
>   payload size threshold was reached.
>   Now, the padding introduced by reusable chunks is tracked, and only
>   if the padding does not exceed the set threshold are the entries
>   re-used. This reduces the possible padding, at the cost of
>   re-encoding more entries. It also avoids re-using chunks which now
>   have large padding holes because of moved/removed files contained
>   within.
> - added headers for metadata archive and payload file
> - added documentation
>
> An invocation of a backup run with these patches is now:
> ```bash
> proxmox-backup-client backup