From: Marco Gaiarin <gaio@lilliput.linux.it>
To: pve-user@lists.proxmox.com
Subject: [PVE-User] ZFS corruption and recovery...
Date: Tue, 19 Mar 2024 17:31:02 +0100
Message-ID: <47rock-18s.ln1@hermione.lilliput.linux.it>
In a small PVE cluster I have a 'backup server', i.e. an old/reconditioned
server that simply provides backup storage for the other nodes: apart from
the rpool, there's another pool, on slow HDDs, used as a data repository,
mainly for rsnapshot.
Being a backup server, it can be powered down without much trouble; last
week I needed an (unused) controller that was inside it, so I powered it
off, removed the controller, and powered it back on.
On Saturday the backup pool started to report errors, and the disks/kernel
complained too, for all four disks in the pool. :-(
Looking at the errors, they did not seem to be media errors, so I powered
off the server and looked carefully at the cabling, finding that last week,
while removing the controller, I had probably inadvertently loosened a power
connection on the disk backplane, damn me.
Fixing the cabling worked as expected: the server starts, SMART says the
disks are good, and everything works as expected.
After the server started, the disks began to resilver, but some errors
remained: a dozen files and directories in the 'Permanent errors' list.
Because this is a backup server, I simply removed most of the errors, with
a few rounds of 'zpool scrub' and 'zpool clear -F', leading to this
situation:
 root@svpve3:~# zpool status -v rpool-backup
   pool: rpool-backup
  state: ONLINE
 status: One or more devices has experienced an error resulting in data
         corruption. Applications may be affected.
 action: Restore the file in question if possible. Otherwise restore the
         entire pool from backup.
    see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-8A
   scan: scrub in progress since Tue Mar 19 16:58:52 2024
         3.89T scanned at 2.97G/s, 745G issued at 569M/s, 13.5T total
         0B repaired, 5.39% done, 06:32:11 to go
 config:

         NAME                                 STATE     READ WRITE CKSUM
         rpool-backup                         ONLINE       0     0     0
           raidz1-0                           ONLINE       0     0     0
             ata-ST8000VN004-3CP101_WWZ1MBA8  ONLINE       0     0     0
             ata-ST8000VN004-3CP101_WWZ1Q7F1  ONLINE       0     0     0
             ata-ST8000VN004-3CP101_WRQ0WQ44  ONLINE       0     0     0
             ata-ST8000VN004-3CP101_WWZ1RFL5  ONLINE       0     0     0
         cache
           scsi-33001438037cd8921             ONLINE       0     0     0

 errors: Permanent errors have been detected in the following files:

         rpool-backup:<0x63f218>
         rpool-backup:<0x108d421>
         /rpool-backup/rsnapshot/daily.bad/FVG_PP/vdmpp2/srv/media/DO/FS/P/26-02-19
         /rpool-backup/rsnapshot/daily.bad/FVG_PP/vdmpp2/srv/media/DO/2012/mc
         /rpool-backup/rsnapshot/daily.bad/FVG_PP/vdmpp2/srv/media/CD/mg2014/100HP507
Apart from the first two, the other three are directories, which it seems I
can no longer delete: the error is 'dir not empty' or 'Invalid exchange'.
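
The two kinds of entries in that list can be told apart mechanically: lines
of the form 'dataset:<0x...>' are object numbers that 'zpool status' did not
translate to a file path, while the others are ordinary paths. A minimal
sketch in Python, assuming the output of 'zpool status -v' has been captured
as text; the function name split_permanent_errors and the SAMPLE string are
made up for illustration and are not part of any ZFS tool. It only handles
the two entry shapes shown above.

```python
import re

# Illustrative excerpt of the 'errors:' section from the output above.
SAMPLE = """\
errors: Permanent errors have been detected in the following files:

        rpool-backup:<0x63f218>
        rpool-backup:<0x108d421>
        /rpool-backup/rsnapshot/daily.bad/FVG_PP/vdmpp2/srv/media/DO/FS/P/26-02-19
        /rpool-backup/rsnapshot/daily.bad/FVG_PP/vdmpp2/srv/media/DO/2012/mc
        /rpool-backup/rsnapshot/daily.bad/FVG_PP/vdmpp2/srv/media/CD/mg2014/100HP507
"""


def split_permanent_errors(status_text):
    """Return (object_ids, paths) found after the 'errors:' line.

    object_ids is a list of (dataset, object_number) tuples,
    paths is a list of absolute path strings.
    """
    object_ids, paths = [], []
    in_errors = False
    for line in status_text.splitlines():
        entry = line.strip()
        if entry.startswith("errors:"):
            in_errors = True
            continue
        if not in_errors or not entry:
            continue
        # Entries such as "rpool-backup:<0x63f218>" carry only an object number.
        match = re.fullmatch(r"(\S+):<(0x[0-9a-fA-F]+)>", entry)
        if match:
            object_ids.append((match.group(1), int(match.group(2), 16)))
        elif entry.startswith("/"):
            paths.append(entry)
    return object_ids, paths


if __name__ == "__main__":
    ids, paths = split_permanent_errors(SAMPLE)
    for dataset, number in ids:
        print(f"object without path: {dataset} #{number}")
    for path in paths:
        print(f"path: {path}")
```

On a live system the text would come from running 'zpool status -v
rpool-backup' and capturing its standard output instead of using SAMPLE.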
How can I fix these errors?! As just stated, this is a backup server, so
losing some files (knowing which files, of course!) is not a problem...
Thanks.
--
Who knows why, when you dial the wrong number, the line is never busy.
(Beppe Grillo)