public inbox for pbs-devel@lists.proxmox.com
From: "Fabian Grünbichler" <f.gruenbichler@proxmox.com>
To: Proxmox Backup Server development discussion
	<pbs-devel@lists.proxmox.com>,
	 Sebastian <s.schauenburg@gmail.com>
Subject: Re: [pbs-devel] Bulk initial sync from remote
Date: Fri, 26 Mar 2021 17:24:03 +0100 (CET)	[thread overview]
Message-ID: <1338533683.857.1616775843288@webmail.proxmox.com> (raw)

> Sebastian <s.schauenburg@gmail.com> wrote on 26.03.2021 16:15:
> 
> Good afternoon everyone,
> 
> is it possible to do an initial bulk sync from a remote? (using external storage media for example)
> E.g. can all files (chunk directory etc.) be blindly copied from one pbs server to the remote pbs server using an external storage medium?

yes. a "blind copy" does risk a certain amount of inconsistency if there are any concurrent actions on the datastore (e.g., if you first copy all the snapshot metadata, then continue with .chunks, and a prune + GC run happens in between and deletes some chunks that you haven't copied yet).

you can avoid that by:
- define the external medium as a datastore, configure a 'local' remote pointing to the same node, and use the sync/pull mechanism instead of a blind copy (that will iterate over snapshots and copy associated chunks together with the snapshot metadata, so you'll never copy orphaned chunks or snapshot metadata without associated chunks). this will incur network/TLS overhead since it works over the API
- do a two-phase rsync or similar, and ensure the datastore is quiet for the final (small) sync

after moving your external disk, you need to manually create the datastore.cfg entry (or create a datastore using the GUI with a different path and then edit it to point to your actual path, or copy the contents from your external media into the created directory).
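for reference, such a manually added entry in /etc/proxmox-backup/datastore.cfg could look like the following (the store name, path, and comment are made-up examples):

```
datastore: external-store
	path /mnt/external/pbs-store
	comment copied from remote via external disk
```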

a datastore directory with the .chunks subdir and the backup type directories (by default: vm, ct, host) is self-contained as far as stored backups are concerned. scheduled jobs (prune, verify, GC) are stored outside, so those need to be recreated if you just have the "raw" datastore.

> Use-case: doing an initial full sync from a remote can cost a lot of bandwidth (or time), while incrementals can be small (when there aren't a lot of changes).

common use case, should work with the caveats noted above :)




Thread overview: 2+ messages
2021-03-26 16:24 Fabian Grünbichler [this message]
2021-03-26 15:15 Sebastian
