From: Marco Gaiarin
Date: Tue, 19 Mar 2024 17:31:02 +0100
To: pve-user@lists.proxmox.com
Subject: [PVE-User] ZFS corruption and recovery...

In a small PVE cluster I have a 'backup server', i.e. an old/reconditioned server that simply provides backup storage for the other nodes: apart from the rpool, there is another pool, on slow HDDs, used as a data repository, mainly for rsnapshot.

As a backup server, it can be powered down without much trouble; last week I needed an (unused) controller that was inside it, so I powered it off, removed the controller and powered it back on.

On Saturday the backup pool started to report errors, and the disks/kernel complained too,
for all four disks in the pool. :-(

Looking at the errors, they did not seem to be media errors, so I powered off the server and looked carefully at the cabling, finding that last week, while removing the controller, I had probably inadvertently loosened a power connector on the disk backplane, damn me.

Fixing the cabling worked as expected: the server started, SMART on the disks says they are good, and everything works as expected.

After the server started, the disks began to resilver, but some errors remained: a dozen files and directories in the 'Permanent errors' list.

Because this is a backup server, I simply removed most of the files with errors, doing some rounds of 'zpool scrub' and 'zpool clear -F', which led to this situation:

 root@svpve3:~# zpool status -v rpool-backup
   pool: rpool-backup
  state: ONLINE
 status: One or more devices has experienced an error resulting in data
         corruption. Applications may be affected.
 action: Restore the file in question if possible. Otherwise restore the
         entire pool from backup.
    see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-8A
   scan: scrub in progress since Tue Mar 19 16:58:52 2024
         3.89T scanned at 2.97G/s, 745G issued at 569M/s, 13.5T total
         0B repaired, 5.39% done, 06:32:11 to go
 config:

         NAME                                 STATE     READ WRITE CKSUM
         rpool-backup                         ONLINE       0     0     0
           raidz1-0                           ONLINE       0     0     0
             ata-ST8000VN004-3CP101_WWZ1MBA8  ONLINE       0     0     0
             ata-ST8000VN004-3CP101_WWZ1Q7F1  ONLINE       0     0     0
             ata-ST8000VN004-3CP101_WRQ0WQ44  ONLINE       0     0     0
             ata-ST8000VN004-3CP101_WWZ1RFL5  ONLINE       0     0     0
         cache
           scsi-33001438037cd8921             ONLINE       0     0     0

 errors: Permanent errors have been detected in the following files:

         rpool-backup:<0x63f218>
         rpool-backup:<0x108d421>
         /rpool-backup/rsnapshot/daily.bad/FVG_PP/vdmpp2/srv/media/DO/FS/P/26-02-19
         /rpool-backup/rsnapshot/daily.bad/FVG_PP/vdmpp2/srv/media/DO/2012/mc
         /rpool-backup/rsnapshot/daily.bad/FVG_PP/vdmpp2/srv/media/CD/mg2014/100HP507

Apart from the first two, the other three are directories that it seems I cannot delete anymore; the error is 'dir not empty' or 'Invalid exchange'.

How can I fix these errors?!
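To keep track of what is still listed between scrub rounds, the error list can be pulled out of the 'zpool status -v' text and split into the two kinds of entries seen above: bare object IDs ('pool:<0x...>') and filesystem paths. The following is only a minimal sketch in Python, assuming the usual layout with one entry per line after the 'errors:' header; the helper name permanent_errors and the SAMPLE text are hypothetical and not part of any ZFS tooling.

```python
import re

# Hypothetical helper, not part of any ZFS tooling: it only parses text that
# has already been captured from `zpool status -v`.
# Matches entries of the form "dataset:<0xHEX>" (object ID without a path).
OBJECT_ID = re.compile(r"^(?P<dataset>[^\s:]+):<(?P<obj>0x[0-9a-fA-F]+)>$")


def permanent_errors(status_text):
    """Return (object_ids, paths) found under the 'errors:' header.

    object_ids is a list of (dataset, object_number) tuples; paths is a list
    of the remaining entries, taken verbatim.
    """
    object_ids, paths = [], []
    in_errors = False
    for raw in status_text.splitlines():
        line = raw.strip()
        if line.startswith("errors:"):
            # Only collect entries when the header announces permanent errors.
            in_errors = "Permanent errors" in line
            continue
        if not in_errors or not line:
            continue
        match = OBJECT_ID.match(line)
        if match:
            object_ids.append((match.group("dataset"), int(match.group("obj"), 16)))
        else:
            paths.append(line)
    return object_ids, paths


# Shortened sample, assumed to mirror the tail of the output quoted above.
SAMPLE = """\
errors: Permanent errors have been detected in the following files:

        rpool-backup:<0x63f218>
        rpool-backup:<0x108d421>
        /rpool-backup/rsnapshot/daily.bad/FVG_PP/vdmpp2/srv/media/DO/2012/mc
"""

if __name__ == "__main__":
    ids, paths = permanent_errors(SAMPLE)
    print("object IDs:", ids)
    print("paths:", paths)
```

Saving the output of each 'zpool status -v' run and comparing the two lists between runs shows whether entries are actually disappearing. The sketch only reads text; it does not touch the pool.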
As just stated, this is a backup server, so losing some files (knowing which files, of course!) is not a problem...

Thanks.

-- 
  Who knows why, when you dial the wrong number, the phone is never busy.
  (Beppe Grillo)