From: Stefan Reiter
To: pbs-devel@lists.proxmox.com
Date: Mon, 7 Sep 2020 17:30:31 +0200
Message-Id: <20200907153036.9324-1-s.reiter@proxmox.com>
Subject: [pbs-devel] [PATCH v2 0/5] Improve corrupt chunk handling

Verify will now rename chunks it detects as corrupted, so future backups
will be forced to write them again. The next GC will then clean these
".bad" files up, since it has to scan each chunk directory anyway.

If the last backup uses some of these chunks, but is not the one that
failed verification, the client may still omit these chunks, which could
lead to a broken backup. Patch 4 detects these cases by checking all
referenced chunks for existence (which certainly adds a bit of overhead,
especially to otherwise minimal dirty-bitmap backups).

Additionally, the last patch makes sure all chunks have their atime
updated, even if they won't be written (when they already exist), to
eliminate a race with GC where the chunk might be missing after a
successful backup.

Also, friendly ping on:
https://lists.proxmox.com/pipermail/pbs-devel/2020-September/000479.html
This series makes the most sense with that patch already applied; the
feedback from Dominik is addressed here.

v2:
* address Thomas' feedback (as well as Dietmar's comment)
* add patch 5

proxmox-backup:

Stefan Reiter (5):
  verify: fix log units
  verify: rename corrupted chunks with .bad extension
  gc: remove .bad files on garbage collect
  backup: check all referenced chunks actually exist
  backup: touch all chunks, even if they exist

 src/api2/backup/environment.rs | 21 +++++++++-
 src/api2/types/mod.rs          |  3 ++
 src/backup/chunk_store.rs      | 73 ++++++++++++++++++++++++++++------
 src/backup/datastore.rs        |  5 ++-
 src/backup/verify.rs           | 34 +++++++++++++++-
 5 files changed, 120 insertions(+), 16 deletions(-)

-- 
2.20.1
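
For readers who want a picture of how the rename, the existence check and the GC cleanup described above fit together, here is a minimal sketch. It is not the code from this series: the flat chunk directory, the `<digest>.bad` naming and the helper names `mark_chunk_bad`, `missing_chunks` and `sweep_bad_chunks` are assumptions made for illustration, and only the Rust standard library is used.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical layout: one file per chunk, named by its digest,
/// directly inside `dir`. The real chunk store layout may differ.
fn chunk_path(dir: &Path, digest: &str) -> PathBuf {
    dir.join(digest)
}

/// Verify step (sketch): move a corrupt chunk aside so that it no longer
/// counts as present and a later backup has to write it again.
fn mark_chunk_bad(dir: &Path, digest: &str) -> io::Result<()> {
    let bad = dir.join(format!("{}.bad", digest));
    fs::rename(chunk_path(dir, digest), bad)
}

/// Backup step (sketch): return every referenced digest that has no
/// chunk file, e.g. because verify renamed it in the meantime.
fn missing_chunks<'a>(dir: &Path, referenced: &[&'a str]) -> Vec<&'a str> {
    referenced
        .iter()
        .copied()
        .filter(|digest| !chunk_path(dir, digest).is_file())
        .collect()
}

/// GC step (sketch): while scanning the directory anyway, delete all
/// ".bad" files and report how many were removed.
fn sweep_bad_chunks(dir: &Path) -> io::Result<usize> {
    let mut removed = 0;
    for entry in fs::read_dir(dir)? {
        let path = entry?.path();
        if path.extension().map_or(false, |ext| ext == "bad") {
            fs::remove_file(&path)?;
            removed += 1;
        }
    }
    Ok(removed)
}
```

In this sketch a backup would call `missing_chunks` on the digests its index references and fail if the result is non-empty, which corresponds to the case patch 4 is meant to catch. The atime update from the last patch is left out here.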