From: "Max Carrara" <m.carrara@proxmox.com>
To: "Christian Ebner" <c.ebner@proxmox.com>,
"Proxmox Backup Server development discussion"
<pbs-devel@lists.proxmox.com>
Subject: Re: [pbs-devel] [PATCH v8 proxmox-backup 4/9] fix #5982: garbage collection: check atime updates are honored
Date: Wed, 09 Apr 2025 12:37:13 +0200 [thread overview]
Message-ID: <D921JCX67RXY.W8KKT9VP73W9@proxmox.com> (raw)
In-Reply-To: <7c458efa-ce3b-4b6a-946c-f87098a08f2b@proxmox.com>
On Sat Apr 5, 2025 at 11:16 AM CEST, Christian Ebner wrote:
> On 4/4/25 17:41, Max Carrara wrote:
> > Regarding my comment in the cover letter: The call to
> > `self.cond_touch_path` above will fail on a directory-based datastore
> > backed by ext4 mounted with -o rw,relatime:
> >
> > 2025-04-04T16:25:44+02:00: TASK ERROR: atime safety check failed: update atime failed for chunk/file "/mnt/datastore/test-noatime/.chunks/bb9f/bb9f8df61474d25e71fa00722318cd387396ca1736605e1248821cc0de3d3af8" - EPERM: Operation not permitted
> >
> > From `man 2 utimensat` (formatted for your reading pleasure):
> >
> > EPERM
> > The caller attempted to change one or both timestamps to a value
> > other than the current time, or to change one of the timestamps to
> > the current time while leaving the other timestamp unchanged, (i.e.,
> > times is not NULL, neither tv_nsec field is UTIME_NOW, and neither
> > tv_nsec field is UTIME_OMIT) and either:
> >
> > • the caller's effective user ID does not match the owner of file,
> > and the caller is not privileged (Linux: does not have the
> > CAP_FOWNER capability); or,
> >
> > • the file is marked append-only or immutable (see chattr(1)).
> >
> > Sooo, I investigated a little more, since the EPERM here initially
> > didn't make any sense to me, because the same atime check actually
> > passes when creating the datastore.
> >
> > Furthermore, the file (chunk) used for this check isn't append-only or
> > immutable (otherwise you'd see an `a` or `i` down there):
> >
> > $ lsattr .chunks/bb9f/bb9f8df61474d25e71fa00722318cd387396ca1736605e1248821cc0de3d3af8
> > --------------e------- .chunks/bb9f/bb9f8df61474d25e71fa00722318cd387396ca1736605e1248821cc0de3d3af8
> >
> >
> > And finally, just for completeness' sake, we're using both UTIME_NOW and
> > UTIME_OMIT in `cond_touch_path`:
> >
> > let times: [libc::timespec; 2] = [
> > // access time -> update to now
> > libc::timespec {
> > tv_sec: 0,
> > tv_nsec: libc::UTIME_NOW,
> > },
> > // modification time -> keep as is
> > libc::timespec {
> > tv_sec: 0,
> > tv_nsec: libc::UTIME_OMIT,
> > },
> > ];
> >
> >
> > According to the docs of `man 2 utimensat`, this would suggest that
> > we're not running under the expected effective UID and don't have
> > the CAP_FOWNER capability.
> >
> > This is the actual crux of the issue, since the chunk is owned by
> > `root`:
> >
> > $ ls -alh .chunks/bb9f
> > total 1.1M
> > drwxr-x--- 2 backup backup 4.0K Apr 4 17:11 .
> > drwxr-x--- 1 backup backup 1.1M Apr 4 17:11 ..
> > -rw-r--r-- 1 root root 159 Apr 4 17:11 bb9f8df61474d25e71fa00722318cd387396ca1736605e1248821cc0de3d3af8
> >
> >
> > Just to add some additional findings -- when the *check is turned off*,
> > GC on that single chunk (no backups made yet!) runs just fine:
> >
> > 2025-04-04T17:00:52+02:00: starting garbage collection on store test-noatime
> > 2025-04-04T17:00:52+02:00: Access time update check disabled by datastore tuning options.
> > 2025-04-04T17:00:52+02:00: Using access time cutoff 1d 5m, minimum access time is 2025-04-03T14:55:52Z
> > 2025-04-04T17:00:52+02:00: Start GC phase1 (mark used chunks)
> > 2025-04-04T17:00:52+02:00: Start GC phase2 (sweep unused chunks)
> > 2025-04-04T17:00:52+02:00: processed 73% (0 chunks)
> > 2025-04-04T17:00:52+02:00: Removed garbage: 0 B
> > 2025-04-04T17:00:52+02:00: Removed chunks: 0
> > 2025-04-04T17:00:52+02:00: Pending removals: 159 B (in 1 chunks)
> > 2025-04-04T17:00:52+02:00: Original data usage: 0 B
> > 2025-04-04T17:00:52+02:00: On-Disk chunks: 0
> > 2025-04-04T17:00:52+02:00: Deduplication factor: 1.00
> > 2025-04-04T17:00:52+02:00: queued notification (id=4bab0755-3f24-4b1a-99b0-de8e9613db9b)
> > 2025-04-04T17:00:52+02:00: TASK OK
> >
> > *BUT*, when creating a backup via PVE, the check actually runs, despite
> > being turned off -- this appears to be something separate that you might
> > want to check as well:
>
> That is expected if the chunk has the wrong ownership, but not because
> the fs_atime_check is executed here. Rather this happens because your VM
> backup includes an all zero chunk, does not have it as known chunk
> however (since the is no previous index file). This means the chunk will
> be uploaded to the PBS, but the chunk store detects it as already
> present. In that case the chunks atime will be updated too, but that
> then fails because of the permission issue. See the corresponding code
> paths here:
> https://git.proxmox.com/?p=proxmox-backup.git;a=blob;f=pbs-datastore/src/chunk_store.rs;h=dc267d752980a988d22e6d410855d2ceb3c8db46;hb=HEAD#l460
Ah yes, of course! I don't know why I didn't think of that. Thanks for
the clarification!
>
> >
> > INFO: starting new backup job: vzdump 100 --mode snapshot --notes-template '{{guestname}}' --remove 0 --node roxy --notification-mode auto --storage pbs-dev-test-atime
> > INFO: Starting Backup of VM 100 (qemu)
> > INFO: Backup started at 2025-04-04 17:02:49
> > INFO: status = running
> > INFO: VM Name: dns
> > INFO: include disk 'scsi0' 'local-zfs-main:vm-100-disk-0' 16G
> > INFO: backup mode: snapshot
> > INFO: ionice priority: 7
> > INFO: snapshots found (not included into backup)
> > INFO: creating Proxmox Backup Server archive 'vm/100/2025-04-04T15:02:49Z'
> > INFO: issuing guest-agent 'fs-freeze' command
> > INFO: issuing guest-agent 'fs-thaw' command
> > ERROR: VM 100 qmp command 'backup' failed - backup register image failed: command error: update atime failed for chunk/file "/mnt/datastore/test-noatime/.chunks/bb9f/bb9f8df61474d25e71fa00722318cd387396ca1736605e1248821cc0de3d3af8" - EPERM: Operation not permitted
> > INFO: aborting backup job
> > INFO: resuming VM again
> > ERROR: Backup of VM 100 failed - VM 100 qmp command 'backup' failed - backup register image failed: command error: update atime failed for chunk/file "/mnt/datastore/test-noatime/.chunks/bb9f/bb9f8df61474d25e71fa00722318cd387396ca1736605e1248821cc0de3d3af8" - EPERM: Operation not permitted
> > INFO: Failed at 2025-04-04 17:02:49
> > INFO: Backup job finished with errors
> > TASK ERROR: job errors
> >
> > Just to be sure:
> >
> > $ cat /etc/proxmox-backup/datastore.cfg
> > # [...]
> > datastore: test-noatime
> > comment
> > gc-schedule daily
> > notification-mode notification-system
> > path /mnt/datastore/test-noatime
> > tuning gc-atime-safety-check=0
> >
> >
> >
> > So, fixing the permissions with:
> >
> > # chown backup: .chunks/bb9f/bb9f8df61474d25e71fa00722318cd387396ca1736605e1248821cc0de3d3af8
> >
> > .. allows the check to pass:
> >
> > 2025-04-04T17:14:34+02:00: starting garbage collection on store test-noatime
> > 2025-04-04T17:14:35+02:00: Access time update check successful, proceeding with GC.
> > 2025-04-04T17:14:35+02:00: Using access time cutoff 1d 5m, minimum access time is 2025-04-03T15:09:34Z
> > 2025-04-04T17:14:35+02:00: Start GC phase1 (mark used chunks)
> > 2025-04-04T17:14:35+02:00: Start GC phase2 (sweep unused chunks)
> > 2025-04-04T17:14:35+02:00: processed 73% (0 chunks)
> > 2025-04-04T17:14:35+02:00: Removed garbage: 0 B
> > 2025-04-04T17:14:35+02:00: Removed chunks: 0
> > 2025-04-04T17:14:35+02:00: Original data usage: 0 B
> > 2025-04-04T17:14:35+02:00: On-Disk chunks: 1
> > 2025-04-04T17:14:35+02:00: Deduplication factor: 0.00
> > 2025-04-04T17:14:35+02:00: Average chunk size: 159 B
> > 2025-04-04T17:14:35+02:00: queued notification (id=06d76349-26af-43da-9ddb-5b0ee2fabfa8)
> > 2025-04-04T17:14:35+02:00: TASK OK
> >
> > So, guess you just have to change the UID and GID when creating the
> > test chunk ;P
>
> Yes, in v9 of the series, the chunk files ownership will be set if an
> `root`s UID is detected as effective UID. This fixes the issue here and
> makes the chunk insertion more robust in general IMHO.
>
> Thanks again for catching this early! If you manage to give it another
> spin would be great.
Yeah I saw that--and it's already applied, too. So, guess another spin's
not needed ;)
_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
next prev parent reply other threads:[~2025-04-09 10:37 UTC|newest]
Thread overview: 20+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-03-21 8:02 [pbs-devel] [PATCH v8 proxmox proxmox-backup 0/9] fix #5982: check atime update is honored Christian Ebner
2025-03-21 8:02 ` [pbs-devel] [PATCH v8 proxmox 1/9] pbs api types: add garbage collection atime safety check flag Christian Ebner
2025-03-21 8:02 ` [pbs-devel] [PATCH v8 proxmox 2/9] pbs api types: add option to set GC chunk cleanup atime cutoff Christian Ebner
2025-03-21 8:02 ` [pbs-devel] [PATCH v8 proxmox-backup 3/9] garbage collection: format error including anyhow error context Christian Ebner
2025-03-21 8:02 ` [pbs-devel] [PATCH v8 proxmox-backup 4/9] fix #5982: garbage collection: check atime updates are honored Christian Ebner
2025-04-04 15:41 ` Max Carrara
2025-04-05 9:16 ` Christian Ebner
2025-04-09 10:37 ` Max Carrara [this message]
2025-03-21 8:02 ` [pbs-devel] [PATCH v8 proxmox-backup 5/9] ui: expose GC atime safety check flag in datastore tuning options Christian Ebner
2025-04-04 15:41 ` Max Carrara
2025-03-21 8:02 ` [pbs-devel] [PATCH v8 proxmox-backup 6/9] docs: mention GC atime update check for " Christian Ebner
2025-04-04 15:41 ` Max Carrara
2025-03-21 8:02 ` [pbs-devel] [PATCH v8 proxmox-backup 7/9] datastore: use custom GC atime cutoff if set Christian Ebner
2025-03-21 8:02 ` [pbs-devel] [PATCH v8 proxmox-backup 8/9] ui: expose GC atime cutoff in datastore tuning option Christian Ebner
2025-03-21 8:02 ` [pbs-devel] [PATCH v8 proxmox-backup 9/9] docs: mention gc-atime-cutoff as " Christian Ebner
2025-04-04 15:41 ` [pbs-devel] [PATCH v8 proxmox proxmox-backup 0/9] fix #5982: check atime update is honored Max Carrara
2025-04-04 15:56 ` Max Carrara
2025-04-04 15:56 ` Christian Ebner
2025-04-04 16:10 ` Max Carrara
2025-04-05 9:06 ` [pbs-devel] superseded: " Christian Ebner
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=D921JCX67RXY.W8KKT9VP73W9@proxmox.com \
--to=m.carrara@proxmox.com \
--cc=c.ebner@proxmox.com \
--cc=pbs-devel@lists.proxmox.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox
Service provided by Proxmox Server Solutions GmbH | Privacy | Legal