From: Christian Ebner <c.ebner@proxmox.com>
To: pbs-devel@lists.proxmox.com
Subject: [pbs-devel] [PATCH v5 proxmox-backup 44/62] client: pxar: add method for metadata comparison
Date: Tue, 7 May 2024 17:52:26 +0200
Message-ID: <20240507155244.793819-45-c.ebner@proxmox.com>
In-Reply-To: <20240507155244.793819-1-c.ebner@proxmox.com>

Add a method to compare the metadata of the current file entry against
the metadata of the entry looked up in the previous backup snapshot. If
the metadata matches, the range in the payload stream covering the
file's payload header and payload is returned.

This is in preparation for reusing payload chunks for unchanged files.

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
changes since version 4:
- refactor to `MetadataArchiveReader` type, reusable by tests with local
accessor
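
For reviewers who want to check the range arithmetic in isolation, the
following is a minimal standalone sketch of the decision logic, not the
code from this patch. `Header`, `Metadata`, `EntryKind`, `PreviousEntry`
and `reusable_range` are hypothetical stand-ins defined locally; the
assumption that the payload header consists of two u64 fields (16 bytes)
is made for illustration only, and a `HashMap` replaces the async
accessor lookup.

```rust
use std::collections::HashMap;
use std::mem::size_of;
use std::ops::Range;

/// Hypothetical stand-in for the payload header (assumed: two u64 fields).
#[allow(dead_code)]
#[repr(C)]
struct Header {
    htype: u64,
    full_size: u64,
}

/// Simplified metadata; the real type carries more fields (xattrs, ACLs, ...).
#[derive(Debug, Clone, PartialEq, Eq)]
struct Metadata {
    mode: u64,
    mtime: i64,
    uid: u32,
    gid: u32,
}

/// Simplified entry kind: only regular files may carry a payload offset.
#[allow(dead_code)]
#[derive(Debug, Clone)]
enum EntryKind {
    File { size: u64, payload_offset: Option<u64> },
    Directory,
}

/// What a lookup in the previous snapshot is assumed to yield.
#[derive(Debug, Clone)]
struct PreviousEntry {
    metadata: Metadata,
    kind: EntryKind,
}

/// Returns the payload stream range (header plus payload) of an unchanged
/// regular file, or `None` if the entry has to be re-encoded.
fn reusable_range(
    previous: &HashMap<String, PreviousEntry>,
    file_name: &str,
    metadata: &Metadata,
) -> Option<Range<u64>> {
    // Not found in the previous archive: re-encode.
    let entry = previous.get(file_name)?;
    // Metadata did not match: re-encode.
    if *metadata != entry.metadata {
        return None;
    }
    match &entry.kind {
        // Regular file with a known offset into the payload stream.
        EntryKind::File { size, payload_offset: Some(offset) } => {
            Some(*offset..*offset + *size + size_of::<Header>() as u64)
        }
        // Not a regular file, or no payload offset recorded: re-encode.
        _ => None,
    }
}
```

Under these assumptions, a 100 byte file whose payload header starts at
offset 4096 maps to the range 4096..4212; any metadata difference, a
non-file entry, or a missing entry yields `None`.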
pbs-client/src/pxar/create.rs | 34 +++++++++++++++++++++++++++++++++-
1 file changed, 33 insertions(+), 1 deletion(-)
diff --git a/pbs-client/src/pxar/create.rs b/pbs-client/src/pxar/create.rs
index 3248fd307..7e6402de5 100644
--- a/pbs-client/src/pxar/create.rs
+++ b/pbs-client/src/pxar/create.rs
@@ -2,6 +2,7 @@ use std::collections::{HashMap, HashSet};
 use std::ffi::{CStr, CString, OsStr};
 use std::fmt;
 use std::io::{self, Read};
+use std::mem::size_of;
 use std::ops::Range;
 use std::os::unix::ffi::OsStrExt;
 use std::os::unix::io::{AsRawFd, FromRawFd, IntoRawFd, OwnedFd, RawFd};
@@ -21,7 +22,7 @@ use proxmox_sys::error::SysError;
 use pxar::accessor::aio::{Accessor, Directory};
 use pxar::accessor::ReadAt;
 use pxar::encoder::{LinkOffset, SeqWrite};
-use pxar::Metadata;
+use pxar::{EntryKind, Metadata};
 
 use proxmox_io::vec;
 use proxmox_lang::c_str;
@@ -344,6 +345,37 @@ impl Archiver {
         .boxed()
     }
 
+    async fn is_reusable_entry(
+        &mut self,
+        previous_metadata_accessor: &mut Directory<MetadataArchiveReader>,
+        file_name: &Path,
+        metadata: &Metadata,
+    ) -> Result<Option<Range<u64>>, Error> {
+        if let Some(file_entry) = previous_metadata_accessor.lookup(file_name).await? {
+            if metadata == file_entry.metadata() {
+                if let EntryKind::File {
+                    payload_offset: Some(offset),
+                    size,
+                    ..
+                } = file_entry.entry().kind()
+                {
+                    let range = *offset..*offset + size + size_of::<pxar::format::Header>() as u64;
+                    log::debug!(
+                        "reusable: {file_name:?} at range {range:?} has unchanged metadata."
+                    );
+                    return Ok(Some(range));
+                }
+                log::debug!("reencode: {file_name:?} not a regular file.");
+                return Ok(None);
+            }
+            log::debug!("reencode: {file_name:?} metadata did not match.");
+            return Ok(None);
+        }
+
+        log::debug!("reencode: {file_name:?} not found in previous archive.");
+        Ok(None)
+    }
+
     /// openat() wrapper which allows but logs `EACCES` and turns `ENOENT` into `None`.
     ///
     /// The `existed` flag is set when iterating through a directory to note that we know the file
--
2.39.2