* [PVE-User] ZFS corruption and recovery...
@ 2024-03-19 16:31 Marco Gaiarin
From: Marco Gaiarin @ 2024-03-19 16:31 UTC
To: pve-user
In a small PVE cluster I have a 'backup server', i.e. an old/reconditioned
server that simply provides backup storage for the other nodes: apart from
the rpool, there is another pool, on slow HDDs, used as a data repository,
mainly for rsnapshot.

Being a backup server, it can be powered down without much trouble; last
week I needed an (unused) controller that was inside it, so I powered it
off, removed the controller, and powered it back on.

On Saturday the backup pool started to report errors, and the disks/kernel
reported errors too, for all four disks in the pool. :-(

Looking at the errors, they did not seem to be media errors, so I powered
off the server and checked the cabling carefully, finding that last week,
while removing the controller, I had probably loosened a power connector on
the disk backplane, damn me.

Fixing the cabling worked as expected: the server starts, SMART says the
disks are good, and everything works as expected.

After the server started, the disks began to resilver, but some errors
remained: a dozen files and directories in the 'Permanent error list'.

Because it is a backup server, I simply got rid of most of the errors with
a few rounds of 'zpool scrub' and 'zpool clear -F', which led to this
situation:
 root@svpve3:~# zpool status -v rpool-backup
   pool: rpool-backup
  state: ONLINE
 status: One or more devices has experienced an error resulting in data
         corruption.  Applications may be affected.
 action: Restore the file in question if possible.  Otherwise restore the
         entire pool from backup.
    see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-8A
   scan: scrub in progress since Tue Mar 19 16:58:52 2024
         3.89T scanned at 2.97G/s, 745G issued at 569M/s, 13.5T total
         0B repaired, 5.39% done, 06:32:11 to go
 config:

         NAME                                 STATE     READ WRITE CKSUM
         rpool-backup                         ONLINE       0     0     0
           raidz1-0                           ONLINE       0     0     0
             ata-ST8000VN004-3CP101_WWZ1MBA8  ONLINE       0     0     0
             ata-ST8000VN004-3CP101_WWZ1Q7F1  ONLINE       0     0     0
             ata-ST8000VN004-3CP101_WRQ0WQ44  ONLINE       0     0     0
             ata-ST8000VN004-3CP101_WWZ1RFL5  ONLINE       0     0     0
         cache
           scsi-33001438037cd8921             ONLINE       0     0     0

 errors: Permanent errors have been detected in the following files:

         rpool-backup:<0x63f218>
         rpool-backup:<0x108d421>
         /rpool-backup/rsnapshot/daily.bad/FVG_PP/vdmpp2/srv/media/DO/FS/P/26-02-19
         /rpool-backup/rsnapshot/daily.bad/FVG_PP/vdmpp2/srv/media/DO/2012/mc
         /rpool-backup/rsnapshot/daily.bad/FVG_PP/vdmpp2/srv/media/CD/mg2014/100HP507

Apart from the first two entries, the other three are directories, which it
seems I can no longer delete; the errors are 'dir not empty' or 'Invalid
exchange'.
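
To keep track of what is still in the list between scrubs, the entries can
be split into object references (dataset name plus object number, which
ZFS prints when it cannot translate the object to a file path) and plain
paths. What follows is only a minimal Python sketch: it assumes the text of
'zpool status -v' has already been captured into a string, and the name
split_permanent_errors is hypothetical, not part of any ZFS tool. It only
classifies the entries; it does not repair anything.

```python
import re

# Entries such as "rpool-backup:<0x63f218>": dataset name, then the object
# number in hex. ZFS prints this form when it cannot show a file path.
OBJECT_REF = re.compile(r"^(?P<dataset>[^\s:]+):<0x(?P<obj>[0-9a-fA-F]+)>$")


def split_permanent_errors(status_text):
    """Return (object_refs, paths) parsed from `zpool status -v` text.

    Hypothetical helper for illustration only.
    """
    object_refs, paths = [], []
    in_errors = False
    for raw in status_text.splitlines():
        line = raw.strip()
        if line.startswith("errors:"):
            # Everything after this header is the permanent error list.
            in_errors = True
            continue
        if not in_errors or not line:
            continue
        match = OBJECT_REF.match(line)
        if match:
            object_refs.append((match.group("dataset"),
                                int(match.group("obj"), 16)))
        elif line.startswith("/"):
            paths.append(line)
    return object_refs, paths


# Shortened sample taken from the output above.
SAMPLE = """\
errors: Permanent errors have been detected in the following files:

        rpool-backup:<0x63f218>
        rpool-backup:<0x108d421>
        /rpool-backup/rsnapshot/daily.bad/FVG_PP/vdmpp2/srv/media/DO/2012/mc
"""

refs, paths = split_permanent_errors(SAMPLE)
for dataset, obj in refs:
    print(f"object {obj:#x} in dataset {dataset}")
for path in paths:
    print(f"path   {path}")
```

Comparing the two lists after each scrub would show whether entries are
actually disappearing or just changing form.
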

How can I fix these errors?! As I said, this is a backup server, so losing
some files (knowing which files, of course!) is not a problem...

Thanks.

--
  Who knows why the line is never busy when you dial the wrong number.
  (Beppe Grillo)