From: Christian Ebner
To: Fabian Grünbichler, Proxmox Backup Server development discussion <pbs-devel@lists.proxmox.com>
Date: Mon, 13 Nov 2023 16:14:53 +0100 (CET)
Message-ID: <1419587687.3325.1699888493974@webmail.proxmox.com>
In-Reply-To: <1699880752.fodcayz7zn.astroid@yuna.none>
References: <20231109184614.1611127-1-c.ebner@proxmox.com> <1699880752.fodcayz7zn.astroid@yuna.none>
Subject: Re: [pbs-devel] [PATCH-SERIES v4 pxar proxmox-backup proxmox-widget-toolkit 00/26] fix #3174: improve file-level backup

Thanks for your comments, some thoughts inline:

> On 13.11.2023 15:23 CET Fabian Grünbichler wrote:
>
> some (high-level) comments focused on compatibility:
>
> the catalog v2 format is used unconditionally at the moment. IMHO it
> should be guarded/opt-in via --change-detection-method, since old
> clients cannot parse it.

While it is true that the new catalog format is not readable by an old
client, the motivation for including it unconditionally was to be able
to also use backups created with the default change detection mode as
reference. Backups with the change detection mode set to metadata would
still not be readable by older clients either way.

I can of course make this conditional and only ever use catalogs with
format version 2 as reference in the case of metadata-based file change
detection.

> else, the following would happen if a client system upgrades:
>
> - pre-upgrade backup (readable by all clients)
> - upgrade
> - post-upgrade backup *with --c-d-m data* (readable by all clients, but
>   everything catalog related only works with new clients)
> - post-upgrade backup *with --c-d-m metadata* (readable by new clients
>   only)
>
> since the pxar format itself also changes (new entry types), it should
> also be bumped (see below). if the new formats are then only used with
> the new metadata mode, both new formats are effectively opt-in (until we
> make that the default mode).
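To illustrate the opt-in gating idea, the catalog format version could simply be derived from the selected change detection mode, so the default "data" mode keeps emitting the old format. A minimal sketch, assuming hypothetical type names (not the actual proxmox-backup API):

```rust
/// Illustrative sketch only: types and function names are made up,
/// not the real proxmox-backup code.

#[derive(Clone, Copy, PartialEq, Debug)]
enum ChangeDetectionMode {
    Data,
    Metadata,
}

#[derive(Clone, Copy, PartialEq, Debug)]
enum CatalogVersion {
    V1,
    V2,
}

fn catalog_version_for(mode: ChangeDetectionMode) -> CatalogVersion {
    match mode {
        // default mode: stay on v1 so old clients can still read the catalog
        ChangeDetectionMode::Data => CatalogVersion::V1,
        // metadata mode needs the extended v2 entries anyway
        ChangeDetectionMode::Metadata => CatalogVersion::V2,
    }
}

fn main() {
    assert_eq!(
        catalog_version_for(ChangeDetectionMode::Data),
        CatalogVersion::V1
    );
    assert_eq!(
        catalog_version_for(ChangeDetectionMode::Metadata),
        CatalogVersion::V2
    );
    println!("ok");
}
```

The trade-off is then exactly as described above: v2 catalogs only exist where the metadata mode was explicitly requested, at the cost of not being able to use a plain data-mode backup as reference for a later metadata-mode run.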
> having the incompatibility between old and new clients encoded right in
> the magic value in the header also means we don't spend time downloading
> indices and chunks only to notice at some random point during the
> restore that we actually don't know how to parse this particular pxar
> archive.

I am not sure I understand your concern here: the latest patch series
already includes a bumped pxar archive format version with its dedicated
magic number.

> an additional bonus point - tools like pxar and proxmox-backup-debug
> could also list the raw+parsed magic value, and in general, error
> messages like:
>
> Error: got unexpected magic number for catalog
>
> are a lot easier to grasp than (pxar extract)
>
> Error: encountered unexpected error during extraction
>
> or (proxmox-backup-client restore)
>
> Error: error extracting archive - encountered unexpected error during extraction
>
> the magic values could also be backported to the oldstable client
> version, to make the error messages even better ("known unsupported" vs
> "unexpected").

Agreed, backporting this should definitely improve the error messages
and make it easier to see what is going on. The obscure error context
`encountered unexpected error during extraction` was, however, not
introduced by this patch series; maybe this should be improved in
separate patches as well?

> in general, UX wise it might be nice to mark backups using the new mode,
> although I am not sure how specifically (some variants - just the
> version/mode, archives, archives+snapshots, ..?).

I have patches to include this in the LXC config as an option for now,
setting metadata-based change detection if the option is set accordingly.
Not sure if this is the best way, any objections or suggestions? I can
include these patches in the next version of the patch series as well.
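The "fail early on the magic value" point could look roughly like the following. This is just a sketch with invented 8-byte magic constants (the real pxar magic values differ); the point is that a version check on the header lets the client report "known unsupported format" before any indices or chunks are fetched:

```rust
// Sketch with illustrative magic values, NOT the real pxar constants.

const PXAR_V1_MAGIC: [u8; 8] = *b"PXAR0001";
const PXAR_V2_MAGIC: [u8; 8] = *b"PXAR0002";

/// Return the detected format version, or a descriptive error that
/// names the unexpected magic value instead of a generic failure.
fn check_magic(header: &[u8]) -> Result<u32, String> {
    let magic: [u8; 8] = header
        .get(..8)
        .and_then(|s| s.try_into().ok())
        .ok_or_else(|| "archive too short to contain a magic value".to_string())?;
    match magic {
        PXAR_V1_MAGIC => Ok(1),
        PXAR_V2_MAGIC => Ok(2),
        _ => Err(format!(
            "got unexpected magic number for pxar archive: {:02x?}",
            magic
        )),
    }
}

fn main() {
    assert_eq!(check_magic(b"PXAR0002rest-of-archive"), Ok(2));
    assert!(check_magic(b"GARBAGE!").is_err());
    println!("ok");
}
```

A backported variant of this check in the oldstable client could then map the v2 magic to a "created by a newer client, please upgrade" message rather than the generic extraction error quoted above.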
> one more peculiarity I noted while testing - doing three backups in a
> row without changing the input tree at all:
>
> - old client
> - new client, mode data
> - new client, mode metadata
>
> the last snapshot has a bigger "logical" size, e.g., when doing this for
> my kernel clone (6.8G), the first two have a logical size of 7.736 GiB,
> while the last one is 8.064 GiB. for smaller input dirs, the effect is
> even more pronounced, a 56M dir with 10 dirs with one file each is
> listed as 55M for the first two, and 97.989 MiB for the last one (almost
> double the size!). the resulting pxar archives are actually this size,
> I guess there is some optimization potential still left for this
> particular case. the actual (deduplicated) difference is just two (small
> test case) / eight (linux) very small chunks, so this issue is mostly
> cosmetic I hope, unless one really goes down the "download pxar file,
> extract manually" route.

Yes, I am still looking into this somewhat unexpected behavior, as the
actual new chunk data produced by the backup run is rather small. I have
an eye on this, especially since Dominik noticed the bloating of the
index, which should be much reduced now. But in some cases it is still
rather significant.

> I hope to do some more in-depth testing and code review over the course
> of the week!