From: Christian Ebner <c.ebner@proxmox.com>
To: "Fabian Grünbichler" <f.gruenbichler@proxmox.com>,
	"Proxmox Backup Server development discussion"
	<pbs-devel@lists.proxmox.com>
Subject: Re: [pbs-devel] [PATCH-SERIES v4 pxar proxmox-backup proxmox-widget-toolkit 00/26] fix #3174: improve file-level backup
Date: Mon, 13 Nov 2023 16:14:53 +0100 (CET)
Message-ID: <1419587687.3325.1699888493974@webmail.proxmox.com>
In-Reply-To: <1699880752.fodcayz7zn.astroid@yuna.none>

Thanks for your comments, some thoughts inline:

> On 13.11.2023 15:23 CET Fabian Grünbichler <f.gruenbichler@proxmox.com> wrote:
>
> some (high-level) comments focused on compatibility:
> 
> the catalog v2 format is used unconditionally at the moment. IMHO it
> should be guarded/opt-in via --change-detection-method, since old
> clients cannot parse it.

While it is true that the new catalog format is not readable by an old client,
the motivation to include this unconditionally was to be able to also use
backups created with the default change detection mode as reference.
Backups with the change detection mode set to metadata would still not be
readable by older clients.

I can of course make this conditional and only ever use catalogs with format
version 2 as reference in cases of metadata based file change detection.
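
To make the conditional variant concrete, here is a minimal sketch of the
selection logic. It is written in Python for brevity and is not taken from the
patches; the function names `catalog_version_for` and `usable_as_reference`
and the mode labels are hypothetical.

```python
from typing import Optional

# Hypothetical labels for the two change detection modes discussed here.
MODE_DATA = "data"
MODE_METADATA = "metadata"


def catalog_version_for(mode: str) -> int:
    """Pick the catalog format version to write for a new backup.

    Assumption: format v2 is opt-in and only written in metadata mode,
    so backups in data mode keep a catalog that old clients can parse.
    """
    if mode == MODE_METADATA:
        return 2
    if mode == MODE_DATA:
        return 1
    raise ValueError(f"unknown change detection mode: {mode!r}")


def usable_as_reference(previous_catalog_version: Optional[int], mode: str) -> bool:
    """Decide whether the previous snapshot's catalog can act as reference.

    Only a v2 catalog is accepted, and only when the current run uses
    metadata based change detection.
    """
    return mode == MODE_METADATA and previous_catalog_version == 2
```

The trade-off of this variant is visible in `usable_as_reference`: a previous
backup created in data mode carries a v1 catalog and can therefore not serve
as reference for the first metadata based run.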

> 
> else, the following would happen if a client system upgrades:
> 
> - pre-upgrade backup (readable by all clients)
> - upgrade
> - post-upgrade backup *with --c-d-m data* (readable by all clients, but
>   everything catalog related only works with new clients)
> - post-upgrade backup *with --c-d-m metadata* (readable by new clients
>   only)
> 
> since the pxar format itself also changes (new entry types), it should
> also be bumped (see below). if the new formats are then only used with
> the new metadata mode, both new formats are effectively opt-in (until we
> make that the default mode). having the incompatibility between old and
> new clients encoded right in the magic value in the header also means we
> don't spend time downloading indices and chunks only to notice at some
> random point within the restore that we actually don't know how to parse
> this particular pxar archive.

I am not sure I understand your concern here. The latest patch series already
includes a bumped pxar archive format version with its dedicated magic number.

> 
> an additional bonus point - tools like pxar and proxmox-backup-debug
> could also list the raw+parsed magic value, and in general, error
> messages like:
> 
>  Error: got unexpected magic number for catalog
> 
> are a lot easier to grasp than (pxar extract)
> 
>  Error: encountered unexpected error during extraction
> 
> or (proxmox-backup-client restore)
> 
>  Error: error extracting archive - encountered unexpected error during extraction
> 
> the magic values could also be backported to the oldstable client
> version, to make the error messages even better ("known unsupported" vs
> "unexpected").

Agreed, back-porting this should definitely improve the error messages and make
it easier to see what is going on. The obscure error context `encountered
unexpected error during extraction` was however not introduced by this patch
series, maybe this should be improved as separate patches as well?
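
As a rough illustration of the "known unsupported" vs "unexpected" distinction,
a client could classify the leading magic bytes as sketched below. This is
Python for brevity and not the actual client code. The 8-byte magic values are
placeholders and NOT the real ones, and the exception and function names are
hypothetical.

```python
# Placeholder magic values, chosen only for this example.
MAGIC_CATALOG_V1 = bytes.fromhex("a1a1a1a1a1a1a1a1")
MAGIC_CATALOG_V2 = bytes.fromhex("b2b2b2b2b2b2b2b2")

# What this (hypothetical, older) client can parse ...
SUPPORTED = {MAGIC_CATALOG_V1: "catalog v1"}
# ... and what it merely knows about, e.g. via a backported constant.
KNOWN_UNSUPPORTED = {MAGIC_CATALOG_V2: "catalog v2"}


class MagicError(Exception):
    """Base class for errors raised while checking the magic number."""


class KnownUnsupportedFormat(MagicError):
    """The format is recognized, but this client cannot parse it."""


class UnexpectedMagic(MagicError):
    """The magic number is not known to this client at all."""


def check_magic(header: bytes) -> str:
    """Return the format name for a supported magic number, else raise."""
    if len(header) < 8:
        raise UnexpectedMagic("file too short to contain a magic number")
    magic = header[:8]
    if magic in SUPPORTED:
        return SUPPORTED[magic]
    if magic in KNOWN_UNSUPPORTED:
        name = KNOWN_UNSUPPORTED[magic]
        raise KnownUnsupportedFormat(
            f"{name} (magic {magic.hex()}) is not supported by this client version"
        )
    raise UnexpectedMagic(f"got unexpected magic number {magic.hex()}")
```

Since the check only looks at the first eight bytes, it can run before any
further data is parsed, and the raw magic value ends up in the error message.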

> 
> in general, UX wise it might be nice to mark backups using the new mode,
> although I am not sure how specifically (some variants - just the
> version/mode, archives, archives+snapshots, ..?).

I have patches that expose this as an option in the LXC config for now, enabling
metadata based change detection if the option is set accordingly. I am not sure
if this is the best way, any objections or suggestions? I can include these
patches in the next version of the patch series as well.

> 
> one more peculiarity I noted while testing - doing three backups in a
> row without changing the input tree at all:
> 
> - old client
> - new client, mode data
> - new client, mode metadata
> 
> the last snapshot has a bigger "logical" size, e.g., when doing this for
> my kernel clone (6.8G), the first two have a logical size of 7.736 GiB,
> while the last one is 8.064 GiB. for smaller input dirs, the effect is
> even more pronounced, a 56M dir with 10 dirs with one file each is
> listed as 55M for the first two, and 97.989 MiB for the last one (almost
> double the size!). the resulting pxar archives are actually this size,
> I guess there is some optimization potential still left for this
> particular case. the actual (deduplicated) difference is just two (small
> test case) / eight (linux) very small chunks, so this issue is mostly
> cosmetic I hope unless one really goes down the "download pxar file,
> extract manually" route.

Yes, I am still looking into this somewhat unexpected behavior, as the amount
of new chunk data created by the backup run is rather small. I am keeping an
eye on this, especially since Dominik noticed the bloating of the index, which
should be much reduced now, but in some cases it is still rather significant.
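
To illustrate in general how the logical size and the new chunk data can
diverge, here is a toy calculation. It does not claim to explain the cause I
am still investigating. The index is modelled as a list of (end offset, digest)
pairs, the numbers and digests are invented, and the function names are
hypothetical.

```python
from typing import List, Set, Tuple

# One index entry: (end offset of the chunk in the archive, chunk digest).
Entry = Tuple[int, str]


def logical_size(index: List[Entry]) -> int:
    """Logical archive size: the end offset of the last indexed chunk."""
    return index[-1][0] if index else 0


def new_chunk_bytes(index: List[Entry], known_digests: Set[str]) -> int:
    """Sum the sizes of chunks whose digest is not already known."""
    seen = set(known_digests)
    total = 0
    start = 0
    for end, digest in index:
        if digest not in seen:
            total += end - start
            seen.add(digest)
        start = end
    return total


# Invented example: the new index references chunk "a" a second time.
previous = [(4, "a"), (8, "b")]
current = [(4, "a"), (8, "b"), (9, "c"), (13, "a")]
```

Here `logical_size(current)` is 13 compared to 8 for `previous`, while
`new_chunk_bytes(current, {"a", "b"})` is only 1, since the re-referenced
chunk adds to the logical size but not to the deduplicated data.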

> 
> I hope to do some more in-depth testing and code review over the course
> of the week!





Thread overview: 32+ messages
2023-11-09 18:45 Christian Ebner
2023-11-09 18:45 ` [pbs-devel] [PATCH v4 pxar 1/26] fix #3174: decoder: factor out skip_bytes from skip_entry Christian Ebner
2023-11-09 18:45 ` [pbs-devel] [PATCH v4 pxar 2/26] fix #3174: decoder: impl skip_bytes for sync dec Christian Ebner
2023-11-09 18:45 ` [pbs-devel] [PATCH v4 pxar 3/26] fix #3174: encoder: calc filename + metadata byte size Christian Ebner
2023-11-09 18:45 ` [pbs-devel] [PATCH v4 pxar 4/26] fix #3174: enc/dec: impl PXAR_APPENDIX_REF entrytype Christian Ebner
2023-11-09 18:45 ` [pbs-devel] [PATCH v4 pxar 5/26] fix #3174: enc/dec: impl PXAR_APPENDIX entrytype Christian Ebner
2023-11-09 18:45 ` [pbs-devel] [PATCH v4 pxar 6/26] fix #3174: encoder: helper to add to encoder position Christian Ebner
2023-11-09 18:45 ` [pbs-devel] [PATCH v4 pxar 7/26] fix #3174: enc/dec: impl PXAR_APPENDIX_TAIL entrytype Christian Ebner
2023-11-09 18:45 ` [pbs-devel] [PATCH v4 proxmox-backup 08/26] fix #3174: index: add fn index list from start/end-offsets Christian Ebner
2023-11-09 18:45 ` [pbs-devel] [PATCH v4 proxmox-backup 09/26] fix #3174: index: add fn digest for DynamicEntry Christian Ebner
2023-11-09 18:45 ` [pbs-devel] [PATCH v4 proxmox-backup 10/26] fix #3174: api: double catalog upload size Christian Ebner
2023-11-09 18:45 ` [pbs-devel] [PATCH v4 proxmox-backup 11/26] fix #3174: catalog: introduce extended format v2 Christian Ebner
2023-11-09 18:46 ` [pbs-devel] [PATCH v4 proxmox-backup 12/26] fix #3174: archiver/extractor: impl appendix ref Christian Ebner
2023-11-09 18:46 ` [pbs-devel] [PATCH v4 proxmox-backup 13/26] fix #3174: catalog: add specialized Archive entry Christian Ebner
2023-11-09 18:46 ` [pbs-devel] [PATCH v4 proxmox-backup 14/26] fix #3174: extractor: impl seq restore from appendix Christian Ebner
2023-11-09 18:46 ` [pbs-devel] [PATCH v4 proxmox-backup 15/26] fix #3174: archiver: store ref to previous backup Christian Ebner
2023-11-09 18:46 ` [pbs-devel] [PATCH v4 proxmox-backup 16/26] fix #3174: upload stream: impl reused chunk injector Christian Ebner
2023-11-09 18:46 ` [pbs-devel] [PATCH v4 proxmox-backup 17/26] fix #3174: chunker: add forced boundaries Christian Ebner
2023-11-09 18:46 ` [pbs-devel] [PATCH v4 proxmox-backup 18/26] fix #3174: backup writer: inject queued chunk in upload steam Christian Ebner
2023-11-09 18:46 ` [pbs-devel] [PATCH v4 proxmox-backup 19/26] fix #3174: archiver: reuse files with unchanged metadata Christian Ebner
2023-11-09 18:46 ` [pbs-devel] [PATCH v4 proxmox-backup 20/26] fix #3174: specs: add backup detection mode specification Christian Ebner
2023-11-09 18:46 ` [pbs-devel] [PATCH v4 proxmox-backup 21/26] fix #3174: client: Add detection mode to backup creation Christian Ebner
2023-11-09 18:46 ` [pbs-devel] [PATCH v4 proxmox-backup 22/26] test-suite: add detection mode change benchmark Christian Ebner
2023-11-09 18:46 ` [pbs-devel] [PATCH v4 proxmox-backup 23/26] test-suite: Add bin to deb, add shell completions Christian Ebner
2023-11-09 18:46 ` [pbs-devel] [PATCH v4 proxmox-backup 24/26] catalog: fetch offset and size for files and refs Christian Ebner
2023-11-09 18:46 ` [pbs-devel] [PATCH v4 proxmox-backup 25/26] pxar: add heuristic to reduce reused chunk fragmentation Christian Ebner
2023-11-09 18:46 ` [pbs-devel] [PATCH v4 proxmox-widget-toolkit 26/26] file-browser: support pxar archive and fileref types Christian Ebner
2023-11-13 14:23 ` [pbs-devel] [PATCH-SERIES v4 pxar proxmox-backup proxmox-widget-toolkit 00/26] fix #3174: improve file-level backup Fabian Grünbichler
2023-11-13 15:14   ` Christian Ebner [this message]
2023-11-13 15:21     ` Christian Ebner
2023-11-13 15:35     ` Fabian Grünbichler
2023-11-13 15:45       ` Christian Ebner
