From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <c.ebner@proxmox.com>
Received: from firstgate.proxmox.com (firstgate.proxmox.com [212.224.123.68])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
 key-exchange X25519 server-signature RSA-PSS (2048 bits))
 (No client certificate requested)
 by lists.proxmox.com (Postfix) with ESMTPS id 4FD5995EE0
 for <pbs-devel@lists.proxmox.com>; Wed, 28 Feb 2024 15:09:49 +0100 (CET)
Received: from firstgate.proxmox.com (localhost [127.0.0.1])
 by firstgate.proxmox.com (Proxmox) with ESMTP id 33759E13F
 for <pbs-devel@lists.proxmox.com>; Wed, 28 Feb 2024 15:09:19 +0100 (CET)
Received: from proxmox-new.maurer-it.com (proxmox-new.maurer-it.com
 [94.136.29.106])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
 key-exchange X25519 server-signature RSA-PSS (2048 bits))
 (No client certificate requested)
 by firstgate.proxmox.com (Proxmox) with ESMTPS
 for <pbs-devel@lists.proxmox.com>; Wed, 28 Feb 2024 15:09:18 +0100 (CET)
Received: from proxmox-new.maurer-it.com (localhost.localdomain [127.0.0.1])
 by proxmox-new.maurer-it.com (Proxmox) with ESMTP id 497B447BE0
 for <pbs-devel@lists.proxmox.com>; Wed, 28 Feb 2024 15:02:53 +0100 (CET)
From: Christian Ebner <c.ebner@proxmox.com>
To: pbs-devel@lists.proxmox.com
Date: Wed, 28 Feb 2024 15:02:20 +0100
Message-Id: <20240228140226.1251979-31-c.ebner@proxmox.com>
X-Mailer: git-send-email 2.39.2
In-Reply-To: <20240228140226.1251979-1-c.ebner@proxmox.com>
References: <20240228140226.1251979-1-c.ebner@proxmox.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Subject: [pbs-devel] [RFC proxmox-backup 30/36] client: pxar: add method for
 metadata comparison
X-BeenThere: pbs-devel@lists.proxmox.com
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Proxmox Backup Server development discussion
 <pbs-devel.lists.proxmox.com>
List-Unsubscribe: <https://lists.proxmox.com/cgi-bin/mailman/options/pbs-devel>, 
 <mailto:pbs-devel-request@lists.proxmox.com?subject=unsubscribe>
List-Archive: <http://lists.proxmox.com/pipermail/pbs-devel/>
List-Post: <mailto:pbs-devel@lists.proxmox.com>
List-Help: <mailto:pbs-devel-request@lists.proxmox.com?subject=help>
List-Subscribe: <https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel>, 
 <mailto:pbs-devel-request@lists.proxmox.com?subject=subscribe>
X-List-Received-Date: Wed, 28 Feb 2024 14:09:49 -0000

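Add a method that compares the metadata of the current file entry
against the metadata of the entry with the same name looked up in the
previous backup snapshot.

If the metadata match and the previous entry is a regular file, the
start offset of its payload in the payload stream is returned. Files
with hardlinks, entries not found in the previous archive, and entries
with changed metadata are re-encoded instead.

This is in preparation for reusing payload chunks of unchanged files.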

Signed-off-by: Christian Ebner <c.ebner@proxmox.com>
---
 pbs-client/src/pxar/create.rs | 33 +++++++++++++++++++++++++++++++--
 1 file changed, 31 insertions(+), 2 deletions(-)
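
For readers skimming the series, the decision order of the new method can
be sketched in isolation as below. This is only an illustration, not the
code from this patch: `Metadata`, `EntryKind`, `PreviousEntry`, `Decision`
and `decide` are simplified, hypothetical stand-ins rather than the pxar
crate's types, and the lookup in the previous archive is assumed to have
happened already (modelled as an `Option`). Unlike the patch, the sketch
treats a missing payload offset as a separate re-encode reason.

```rust
// Hypothetical stand-in for pxar::Metadata; only equality matters here.
#[derive(Clone, Debug, PartialEq)]
struct Metadata {
    mode: u32,
    mtime: i64,
}

// Hypothetical stand-in for pxar::EntryKind, reduced to two variants.
#[derive(Debug)]
enum EntryKind {
    File { payload_offset: Option<u64> },
    Directory,
}

// An entry as it would be found in the previous snapshot (assumed shape).
struct PreviousEntry {
    metadata: Metadata,
    kind: EntryKind,
}

// Outcome of the check: reuse the payload at an offset, or re-encode
// together with the reason that would be logged.
#[derive(Debug, PartialEq)]
enum Decision {
    Reuse(u64),
    ReEncode(&'static str),
}

fn decide(nlink: u64, current: &Metadata, previous: Option<&PreviousEntry>) -> Decision {
    // Files with more than one link are always re-encoded.
    if nlink > 1 {
        return Decision::ReEncode("has hardlinks");
    }

    // No entry with this name in the previous archive.
    let prev = match previous {
        Some(prev) => prev,
        None => return Decision::ReEncode("not found in previous archive"),
    };

    // Any metadata difference forces a re-encode.
    if *current != prev.metadata {
        return Decision::ReEncode("metadata did not match");
    }

    // Only regular files carry a payload offset that can be reused.
    match &prev.kind {
        EntryKind::File { payload_offset: Some(offset) } => Decision::Reuse(*offset),
        EntryKind::File { payload_offset: None } => {
            Decision::ReEncode("no payload offset recorded")
        }
        EntryKind::Directory => Decision::ReEncode("not a regular file"),
    }
}
```

The cheap `st_nlink` check comes first so that hardlinked files never
trigger a lookup in the previous archive.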

diff --git a/pbs-client/src/pxar/create.rs b/pbs-client/src/pxar/create.rs
index 6713daf3..39864483 100644
--- a/pbs-client/src/pxar/create.rs
+++ b/pbs-client/src/pxar/create.rs
@@ -17,9 +17,9 @@ use nix::sys::stat::{FileStat, Mode};
 
 use pathpatterns::{MatchEntry, MatchFlag, MatchList, MatchType, PatternFlag};
 use proxmox_sys::error::SysError;
-use pxar::accessor::aio::Accessor;
+use pxar::accessor::aio::{Accessor, Directory};
 use pxar::encoder::{LinkOffset, PayloadOffset, SeqWrite};
-use pxar::Metadata;
+use pxar::{EntryKind, Metadata};
 
 use proxmox_io::vec;
 use proxmox_lang::c_str;
@@ -422,6 +422,35 @@ impl Archiver {
         .boxed()
     }
 
+    async fn is_reusable_entry(
+        &mut self,
+        accessor: &mut Directory<LocalDynamicReadAt<RemoteChunkReader>>,
+        file_name: &Path,
+        stat: &FileStat,
+        metadata: &Metadata,
+    ) -> Result<Option<u64>, Error> {
+        if stat.st_nlink > 1 {
+            log::debug!("re-encode: {file_name:?} has hardlinks.");
+            return Ok(None);
+        }
+
+        if let Some(file_entry) = accessor.lookup(file_name).await? {
+            if metadata == file_entry.metadata() {
+                if let EntryKind::File { payload_offset, .. } = file_entry.entry().kind() {
+                    log::debug!("re-use: {file_name:?} has unchanged metadata.");
+                    return Ok(*payload_offset);
+                }
+                log::debug!("re-encode: {file_name:?} not a regular file.");
+                return Ok(None);
+            }
+            log::debug!("re-encode: {file_name:?} metadata did not match.");
+            return Ok(None);
+        }
+
+        log::debug!("re-encode: {file_name:?} not found in previous archive.");
+        Ok(None)
+    }
+
     /// openat() wrapper which allows but logs `EACCES` and turns `ENOENT` into `None`.
     ///
     /// The `existed` flag is set when iterating through a directory to note that we know the file
-- 
2.39.2