From: "Fabian Grünbichler" <f.gruenbichler@proxmox.com>
To: Proxmox Backup Server development discussion
<pbs-devel@lists.proxmox.com>
Subject: Re: [pbs-devel] [RFC v2 proxmox-backup 27/36] client: implement prepare reference method
Date: Tue, 12 Mar 2024 11:07:24 +0100
Message-ID: <1710237573.g7a8c3ms4l.astroid@yuna.none>
In-Reply-To: <20240305092703.126906-28-c.ebner@proxmox.com>
On March 5, 2024 10:26 am, Christian Ebner wrote:
> Implement a method that prepares the decoder instance to access a
> previous snapshots metadata index and payload index in order to
> pass it to the pxar archiver. The archiver than can utilize these
> to compare the metadata for files to the previous state and gather
> reusable chunks.
>
> Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
> ---
> changes since version 1:
> - no changes
>
> pbs-client/src/pxar/create.rs | 13 ++++++
> pbs-client/src/pxar/mod.rs | 2 +-
> proxmox-backup-client/src/main.rs | 71 ++++++++++++++++++++++++++++++-
> 3 files changed, 83 insertions(+), 3 deletions(-)
>
> diff --git a/pbs-client/src/pxar/create.rs b/pbs-client/src/pxar/create.rs
> index 9ae84d37..cb0af29e 100644
> --- a/pbs-client/src/pxar/create.rs
> +++ b/pbs-client/src/pxar/create.rs
> @@ -17,6 +17,7 @@ use nix::sys::stat::{FileStat, Mode};
>
> use pathpatterns::{MatchEntry, MatchFlag, MatchList, MatchType, PatternFlag};
> use proxmox_sys::error::SysError;
> +use pxar::accessor::aio::Accessor;
> use pxar::encoder::{LinkOffset, SeqWrite};
> use pxar::Metadata;
>
> @@ -24,7 +25,9 @@ use proxmox_io::vec;
> use proxmox_lang::c_str;
> use proxmox_sys::fs::{self, acl, xattr};
>
> +use crate::RemoteChunkReader;
> use pbs_datastore::catalog::BackupCatalogWriter;
> +use pbs_datastore::dynamic_index::{DynamicIndexReader, LocalDynamicReadAt};
>
> use crate::inject_reused_chunks::InjectChunks;
> use crate::pxar::metadata::errno_is_unsupported;
> @@ -46,6 +49,16 @@ pub struct PxarCreateOptions {
> pub skip_e2big_xattr: bool,
> }
>
> +/// Statefull information of previous backups snapshots for partial backups
> +pub struct PxarPrevRef {
> + /// Reference accessor for metadata comparison
> + pub accessor: Accessor<LocalDynamicReadAt<RemoteChunkReader>>,
> + /// Reference index for reusing payload chunks
> + pub payload_index: DynamicIndexReader,
> + /// Reference archive name for partial backups
> + pub archive_name: String,
> +}
> +
> fn detect_fs_type(fd: RawFd) -> Result<i64, Error> {
> let mut fs_stat = std::mem::MaybeUninit::uninit();
> let res = unsafe { libc::fstatfs(fd, fs_stat.as_mut_ptr()) };
> diff --git a/pbs-client/src/pxar/mod.rs b/pbs-client/src/pxar/mod.rs
> index 14674b9b..24315f5f 100644
> --- a/pbs-client/src/pxar/mod.rs
> +++ b/pbs-client/src/pxar/mod.rs
> @@ -56,7 +56,7 @@ pub(crate) mod tools;
> mod flags;
> pub use flags::Flags;
>
> -pub use create::{create_archive, PxarCreateOptions};
> +pub use create::{create_archive, PxarCreateOptions, PxarPrevRef};
> pub use extract::{
> create_tar, create_zip, extract_archive, extract_sub_dir, extract_sub_dir_seq, ErrorHandler,
> OverwriteFlags, PxarExtractContext, PxarExtractOptions,
> diff --git a/proxmox-backup-client/src/main.rs b/proxmox-backup-client/src/main.rs
> index f077ddf6..8d657c15 100644
> --- a/proxmox-backup-client/src/main.rs
> +++ b/proxmox-backup-client/src/main.rs
> @@ -21,6 +21,7 @@ use proxmox_router::{cli::*, ApiMethod, RpcEnvironment};
> use proxmox_schema::api;
> use proxmox_sys::fs::{file_get_json, image_size, replace_file, CreateOptions};
> use proxmox_time::{epoch_i64, strftime_local};
> +use pxar::accessor::aio::Accessor;
> use pxar::accessor::{MaybeReady, ReadAt, ReadAtOperation};
>
> use pbs_api_types::{
> @@ -30,7 +31,7 @@ use pbs_api_types::{
> BACKUP_TYPE_SCHEMA, TRAFFIC_CONTROL_BURST_SCHEMA, TRAFFIC_CONTROL_RATE_SCHEMA,
> };
> use pbs_client::catalog_shell::Shell;
> -use pbs_client::pxar::ErrorHandler as PxarErrorHandler;
> +use pbs_client::pxar::{ErrorHandler as PxarErrorHandler, PxarPrevRef};
> use pbs_client::tools::{
> complete_archive_name, complete_auth_id, complete_backup_group, complete_backup_snapshot,
> complete_backup_source, complete_chunk_size, complete_group_or_snapshot,
> @@ -50,7 +51,7 @@ use pbs_client::{
> };
> use pbs_datastore::catalog::{BackupCatalogWriter, CatalogReader, CatalogWriter};
> use pbs_datastore::chunk_store::verify_chunk_size;
> -use pbs_datastore::dynamic_index::{BufferedDynamicReader, DynamicIndexReader};
> +use pbs_datastore::dynamic_index::{BufferedDynamicReader, DynamicIndexReader, LocalDynamicReadAt};
> use pbs_datastore::fixed_index::FixedIndexReader;
> use pbs_datastore::index::IndexFile;
> use pbs_datastore::manifest::{
> @@ -1181,6 +1182,72 @@ async fn create_backup(
> Ok(Value::Null)
> }
>
> +async fn prepare_reference(
> + target_base: &str,
> + extension: &str,
> + manifest: Option<Arc<BackupManifest>>,
> + backup_writer: &BackupWriter,
> + backup_reader: Option<Arc<BackupReader>>,
> + crypt_config: Option<Arc<CryptConfig>>,
> +) -> Result<Option<PxarPrevRef>, Error> {
> + let target = format!("{target_base}.meta.{extension}");
> + let payload_target = format!("{target_base}.pld.{extension}");
> +
> + let manifest = if let Some(ref manifest) = manifest {
> + manifest
> + } else {
> + return Ok(None);
> + };
> +
> + let backup_reader = if let Some(ref reader) = backup_reader {
> + reader
> + } else {
> + return Ok(None);
> + };

couldn't these checks be done before/at the call site, so that this fn
takes the manifest and reader without the Option? see the comments on
the patch where this is used.
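
a rough sketch of the shape I have in mind, not meant as the actual
implementation: `Manifest`, `Reader`, `PrevRef` and `reference_for` are
made-up stand-ins for this example, the real fn would keep using
BackupManifest, BackupReader and PxarPrevRef and stay async.

```rust
use std::sync::Arc;

// Stand-ins for BackupManifest / BackupReader (assumption, only here so
// that the sketch compiles on its own).
struct Manifest;
struct Reader;

// Hypothetical, simplified result type standing in for PxarPrevRef.
#[derive(Debug, PartialEq)]
struct PrevRef {
    archive_name: String,
}

// The fn itself takes plain references and no longer has to deal with a
// missing manifest or reader.
fn prepare_reference(
    target_base: &str,
    extension: &str,
    _manifest: &Manifest,
    _reader: &Reader,
) -> Option<PrevRef> {
    Some(PrevRef {
        archive_name: format!("{target_base}.meta.{extension}"),
    })
}

// Call site: only attempt to prepare a reference if both are available.
fn reference_for(
    target_base: &str,
    extension: &str,
    manifest: Option<Arc<Manifest>>,
    reader: Option<Arc<Reader>>,
) -> Option<PrevRef> {
    match (manifest.as_deref(), reader.as_deref()) {
        (Some(manifest), Some(reader)) => {
            prepare_reference(target_base, extension, manifest, reader)
        }
        _ => None,
    }
}
```

that way the "nothing to compare against" decision lives in one place
at the call site.
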
> +
> + let metadata_ref_index = if let Ok(index) = backup_reader
> + .download_dynamic_index(&manifest, &target)
> + .await
> + {
> + index
> + } else {
> + log::info!("No previous metadata index, fallback to regular encoding");
> + return Ok(None);
> + };
> +
> + let known_payload_chunks = Arc::new(Mutex::new(HashSet::new()));
> + let payload_ref_index = if let Ok(index) = backup_writer
> + .download_previous_dynamic_index(&payload_target, &manifest, known_payload_chunks)
> + .await
> + {
> + index
> + } else {
> + log::info!("No previous payload index, fallback to regular encoding");
> + return Ok(None);
> + };

for these two, it might make sense to differentiate between:
- the previous manifest doesn't list that index -> no need to attempt
the download, we can just skip
- the previous manifest lists that index -> we try to download it -> we
need to handle any error (and show the user the error message, since it
might indicate a real problem after all!)
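
roughly like the following sketch, where the manifest is modelled as a
plain map and `download_index` / `previous_index` are hypothetical
helpers made up for illustration (the real code would check the
manifest, e.g. via the lookup_file_info already used further down, and
then call the existing download methods):

```rust
use std::collections::HashMap;

// Stand-in for the previous manifest: archive name -> index content
// (assumption for this sketch only).
struct Manifest {
    files: HashMap<String, Vec<u8>>,
}

// Hypothetical download helper, an empty entry simulates a failed download.
fn download_index(manifest: &Manifest, name: &str) -> Result<Vec<u8>, String> {
    match manifest.files.get(name) {
        Some(data) if !data.is_empty() => Ok(data.clone()),
        Some(_) => Err(format!("unable to download index '{name}'")),
        None => Err(format!("no such file '{name}' in manifest")),
    }
}

fn previous_index(manifest: &Manifest, name: &str) -> Result<Option<Vec<u8>>, String> {
    // not listed in the previous manifest -> nothing to download, skip
    if !manifest.files.contains_key(name) {
        return Ok(None);
    }
    // listed -> a failing download is a real error and gets propagated,
    // so the caller can show the message to the user
    download_index(manifest, name).map(Some)
}
```

with this split, only the first case falls back to regular encoding
silently, the second one surfaces the error.
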
> +
> + log::info!("Using previous index as metadata reference for '{target}'");
> +
> + let most_used = metadata_ref_index.find_most_used_chunks(8);
> + let file_info = manifest.lookup_file_info(&target)?;
> + let chunk_reader = RemoteChunkReader::new(
> + backup_reader.clone(),
> + crypt_config.clone(),
> + file_info.chunk_crypt_mode(),
> + most_used,
> + );
> + let reader = BufferedDynamicReader::new(metadata_ref_index, chunk_reader);
> + let archive_size = reader.archive_size();
> + let reader = LocalDynamicReadAt::new(reader);
> + let accessor = Accessor::new(reader, archive_size).await?;
> +
> + Ok(Some(pbs_client::pxar::PxarPrevRef {
> + accessor,
> + payload_index: payload_ref_index,
> + archive_name: target,
> + }))
> +}
> +
> async fn dump_image<W: Write>(
> client: Arc<BackupReader>,
> crypt_config: Option<Arc<CryptConfig>>,
> --
> 2.39.2