From: Stefan Hanreich
To: Proxmox Backup Server development discussion, Lukas Wagner
Date: Thu, 1 Dec 2022 14:29:18 +0100
Subject: Re: [pbs-devel] [PATCH proxmox-backup 0/2] debug cli: improve output, optionally compare file content for `diff archive`
In-Reply-To: <20221129141730.740199-1-l.wagner@proxmox.com>

I reviewed this by generating 100 random files via the following command:

seq -w 1 100 | xargs -n1 -I% sh -c 'dd if=/dev/urandom of=file.% bs=$(shuf -i1-10 -n1) count=1024'

Then I created a backup from those files, recreated the files with the above command and created a second backup.
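
For readability, the one-liner above can be expanded into a loop. This is only a sketch of the same idea: the function name gen_files and the target-directory argument are assumptions for illustration. Since dd copies count=1024 blocks of bs bytes each and bs is drawn from 1 to 10, every file ends up between 1024 and 10240 bytes long.

```shell
#!/bin/sh
# Sketch: expanded form of the one-liner. gen_files and its directory
# argument are hypothetical names, not part of the original command.
gen_files() {
    dir=$1
    for i in $(seq -w 1 100); do
        # random block size between 1 and 10 bytes, as in the one-liner
        bs=$(shuf -i1-10 -n1)
        # 1024 blocks of $bs bytes -> 1024..10240 bytes of random data
        dd if=/dev/urandom of="$dir/file.$i" bs="$bs" count=1024 2>/dev/null
    done
}
```

Calling gen_files on the same directory a second time overwrites every file with fresh random data, which corresponds to the "recreated the files" step.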

Additionally I created a 3rd backup where I:

  • changed uid & gid of 10 files
  • changed permissions of 10 files
  • deleted 10 files
  • created 10 new files
  • moved some of the files into a new subdirectory
  • touched 10 files
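
The changes listed above could be scripted roughly as follows. This is a hypothetical reconstruction, not the exact commands I ran: which files are touched by which step, the uid/gid 12345, the mode 600, and the names "subdir" and "new.N" are all assumptions made for the sketch. It expects the file.001 .. file.100 layout produced by the generator command.

```shell
#!/bin/sh
# Sketch: apply the third-backup changes to a directory of file.001..file.100.
# File ranges, uid/gid 12345, mode 600, "subdir" and "new.N" are assumptions.
mutate_tree() {
    dir=$1
    mkdir -p "$dir/subdir"
    for i in $(seq 1 50); do
        f=$(printf '%s/file.%03d' "$dir" "$i")
        if [ "$i" -le 10 ]; then
            # changing uid & gid needs root, so skip it for other users
            if [ "$(id -u)" -eq 0 ]; then chown 12345:12345 "$f"; fi
        elif [ "$i" -le 20 ]; then
            chmod 600 "$f"            # change permissions
        elif [ "$i" -le 30 ]; then
            rm "$f"                   # delete
        elif [ "$i" -le 40 ]; then
            mv "$f" "$dir/subdir/"    # move into a new subdirectory
        else
            touch "$f"                # update mtime only
        fi
    done
    # create 10 new files with a few random bytes each
    for i in $(seq 1 10); do
        dd if=/dev/urandom of="$dir/new.$i" bs=16 count=1 2>/dev/null
    done
}
```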

All changes between 1 -> 2, 2 -> 3, and 1 -> 3 were picked up by the diff tool as far as I could tell.

Some files whose content had changed while everything else stayed the same (except for mtime) were not marked as changed without the --compare-content flag. Maybe we should either not mark any files as changed without --compare-content, or mark every file as changed where any attribute changed?

With --compare-content the changes were picked up, which is what this flag is for, so I think it's OK.
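
To cross-check results like this independently of the diff tool, the source trees of two backups can be compared byte by byte. The following is a minimal sketch under the assumption that both trees are still available as plain directories; list_content_changes is a hypothetical helper and only looks at regular files in the top level of each directory.

```shell
#!/bin/sh
# Sketch: print the names of files that exist in both directories but
# differ in content. list_content_changes is a hypothetical helper.
list_content_changes() {
    old=$1
    new=$2
    for f in "$old"/*; do
        [ -f "$f" ] || continue
        name=${f##*/}
        # files missing on one side are additions/deletions, not content changes
        [ -f "$new/$name" ] || continue
        # cmp -s is silent and exits non-zero when the bytes differ
        cmp -s "$f" "$new/$name" || printf '%s\n' "$name"
    done
}
```

Every name printed here should show up as modified when the tool is run with --compare-content.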


Code LGTM!


Reviewed-by: Stefan Hanreich <s.hanreich@proxmox.com>
Tested-by: Stefan Hanreich <s.hanreich@proxmox.com>

On 11/29/22 15:17, Lukas Wagner wrote:
This patch series contains a few improvements for the `diff archive` tool,
mainly based on Wolfgang's suggestions.

First, the output is now much more detailed and shows
some relevant file attributes, including what has changed between
snapshots. Changed attributes are highlighted by a "*".

For instance:

$ proxmox-backup-debug diff archive ...
A  f   644  10045  10000     0 B  2022-11-28 13:44:51  add.txt
M  f   644  10045  10000     6 B *2022-11-28 13:45:05  content.txt
D  f   644  10045  10000     0 B  2022-11-28 13:17:09  deleted.txt
M  f   644  10045    *29     0 B  2022-11-28 13:16:20  gid.txt
M  f  *777  10045  10000     0 B  2022-11-28 13:42:47  mode.txt
M  f   644  10045  10000     0 B *2022-11-28 13:44:33  mtime.txt
M  f   644  10045  10000    *7 B *2022-11-28 13:44:59 *size.txt
M  f   644 *64045  10000     0 B  2022-11-28 13:16:18  uid.txt
M *f   644  10045  10000    10 B  2022-11-28 13:44:59  type.txt
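
As an illustration of how this output can be post-processed, the sketch below counts entries per change type. It assumes, going by the file names in the sample, that the first column is A for added, M for modified and D for deleted; summarize_diff is a hypothetical helper and not part of the tool.

```shell
#!/bin/sh
# Sketch: count diff entries per change type, read from standard input.
# Assumes column 1 is A (added), M (modified) or D (deleted).
summarize_diff() {
    awk '$1 ~ /^[AMD]$/ { c[$1]++ }
         END { printf "A=%d M=%d D=%d\n", c["A"], c["M"], c["D"] }'
}
```

Piping the sample listing above into summarize_diff prints A=1 M=7 D=1.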

The second commit introduces the possibility to pass
the --compare-content flag to the tool. If the flag is passed,
the tool will compare the file content instead of relying on mtime
alone to detect modifications.

Lukas Wagner (2):
  debug cli: show more file attributes for `diff archive` command
  debug cli: add 'compare-content' flag to `diff archive` command

 src/bin/proxmox_backup_debug/diff.rs | 356 ++++++++++++++++++++++-----
 1 file changed, 299 insertions(+), 57 deletions(-)
