public inbox for pve-devel@lists.proxmox.com
 help / color / mirror / Atom feed
* [pve-devel] (already applied) [PATCH zfsonlinux/kernel 0/4] Fix ZFS data corruption bug
@ 2023-11-29  9:27 Fabian Grünbichler
  2023-11-29  9:27 ` [pve-devel] [PATCH pve-kernel 1/2] update zfs to 2.2.0-pve4 Fabian Grünbichler
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: Fabian Grünbichler @ 2023-11-29  9:27 UTC (permalink / raw)
  To: pve-devel

cherry-picked from upstream's staging-2.2.2 and verified using the
reproducer script on an affected machine.

kernel side (actual fix):

Fabian Grünbichler (2):
  update zfs to 2.2.0-pve4
  bump version to 6.5.11-6

 Makefile              | 2 +-
 debian/changelog      | 6 ++++++
 submodules/zfsonlinux | 2 +-
 3 files changed, 8 insertions(+), 2 deletions(-)

zfs userspace (for version consistency/reduced confusion):

Fabian Grünbichler (2):
  cherry-pick fix for data corruption
  bump version to 2.2.0-pve4

 debian/changelog                              |  7 ++
 ...heck-dnode-and-its-data-for-dirtines.patch | 97 +++++++++++++++++++
 debian/patches/series                         |  1 +
 3 files changed, 105 insertions(+)
 create mode 100644 debian/patches/0018-dnode_is_dirty-check-dnode-and-its-data-for-dirtines.patch

-- 
2.39.2





^ permalink raw reply	[flat|nested] 5+ messages in thread

* [pve-devel] [PATCH pve-kernel 1/2] update zfs to 2.2.0-pve4
  2023-11-29  9:27 [pve-devel] (already applied) [PATCH zfsonlinux/kernel 0/4] Fix ZFS data corruption bug Fabian Grünbichler
@ 2023-11-29  9:27 ` Fabian Grünbichler
  2023-11-29  9:27 ` [pve-devel] [PATCH pve-kernel 2/2] bump version to 6.5.11-6 Fabian Grünbichler
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 5+ messages in thread
From: Fabian Grünbichler @ 2023-11-29  9:27 UTC (permalink / raw)
  To: pve-devel

which contains the fix for https://github.com/openzfs/zfs/issues/15526

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---
 submodules/zfsonlinux | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/submodules/zfsonlinux b/submodules/zfsonlinux
index e295f30..00036e5 160000
--- a/submodules/zfsonlinux
+++ b/submodules/zfsonlinux
@@ -1 +1 @@
-Subproject commit e295f30e6acc962f76a71306a569f292994614c0
+Subproject commit 00036e5a6eba553a9f6d116c52c7189f773ac24d
-- 
2.39.2





^ permalink raw reply	[flat|nested] 5+ messages in thread

* [pve-devel] [PATCH pve-kernel 2/2] bump version to 6.5.11-6
  2023-11-29  9:27 [pve-devel] (already applied) [PATCH zfsonlinux/kernel 0/4] Fix ZFS data corruption bug Fabian Grünbichler
  2023-11-29  9:27 ` [pve-devel] [PATCH pve-kernel 1/2] update zfs to 2.2.0-pve4 Fabian Grünbichler
@ 2023-11-29  9:27 ` Fabian Grünbichler
  2023-11-29  9:27 ` [pve-devel] [PATCH zfsonlinux 1/2] cherry-pick fix for data corruption Fabian Grünbichler
  2023-11-29  9:27 ` [pve-devel] [PATCH zfsonlinux 2/2] bump version to 2.2.0-pve4 Fabian Grünbichler
  3 siblings, 0 replies; 5+ messages in thread
From: Fabian Grünbichler @ 2023-11-29  9:27 UTC (permalink / raw)
  To: pve-devel

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---
 Makefile         | 2 +-
 debian/changelog | 6 ++++++
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/Makefile b/Makefile
index f0d119c..991e39e 100644
--- a/Makefile
+++ b/Makefile
@@ -6,7 +6,7 @@ KERNEL_MIN=5
 KERNEL_PATCHLEVEL=11
 # increment KREL for every published package release!
 # rebuild packages with new KREL and run 'make abiupdate'
-KREL=5
+KREL=6
 
 KERNEL_MAJMIN=$(KERNEL_MAJ).$(KERNEL_MIN)
 KERNEL_VER=$(KERNEL_MAJMIN).$(KERNEL_PATCHLEVEL)
diff --git a/debian/changelog b/debian/changelog
index 646728e..e5442aa 100644
--- a/debian/changelog
+++ b/debian/changelog
@@ -1,3 +1,9 @@
+proxmox-kernel-6.5 (6.5.11-6) bookworm; urgency=medium
+
+  * cherry-pick ZFS fix for (rare) dirty dnode data corruption bug
+
+ -- Proxmox Support Team <support@proxmox.com>  Wed, 29 Nov 2023 09:32:26 +0100
+
 proxmox-kernel-6.5 (6.5.11-5) bookworm; urgency=medium
 
   * properly set CONFIG_VFIO_VIRQFD as a boolean
-- 
2.39.2





^ permalink raw reply	[flat|nested] 5+ messages in thread

* [pve-devel] [PATCH zfsonlinux 1/2] cherry-pick fix for data corruption
  2023-11-29  9:27 [pve-devel] (already applied) [PATCH zfsonlinux/kernel 0/4] Fix ZFS data corruption bug Fabian Grünbichler
  2023-11-29  9:27 ` [pve-devel] [PATCH pve-kernel 1/2] update zfs to 2.2.0-pve4 Fabian Grünbichler
  2023-11-29  9:27 ` [pve-devel] [PATCH pve-kernel 2/2] bump version to 6.5.11-6 Fabian Grünbichler
@ 2023-11-29  9:27 ` Fabian Grünbichler
  2023-11-29  9:27 ` [pve-devel] [PATCH zfsonlinux 2/2] bump version to 2.2.0-pve4 Fabian Grünbichler
  3 siblings, 0 replies; 5+ messages in thread
From: Fabian Grünbichler @ 2023-11-29  9:27 UTC (permalink / raw)
  To: pve-devel

cherry-picked from 2.2.0-staging, fixing
https://github.com/openzfs/zfs/issues/15526

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---
 ...heck-dnode-and-its-data-for-dirtines.patch | 97 +++++++++++++++++++
 debian/patches/series                         |  1 +
 2 files changed, 98 insertions(+)
 create mode 100644 debian/patches/0018-dnode_is_dirty-check-dnode-and-its-data-for-dirtines.patch

diff --git a/debian/patches/0018-dnode_is_dirty-check-dnode-and-its-data-for-dirtines.patch b/debian/patches/0018-dnode_is_dirty-check-dnode-and-its-data-for-dirtines.patch
new file mode 100644
index 0000000..f79b09b
--- /dev/null
+++ b/debian/patches/0018-dnode_is_dirty-check-dnode-and-its-data-for-dirtines.patch
@@ -0,0 +1,97 @@
+From 9b9b09f452a469458451c221debfbab944e7f081 Mon Sep 17 00:00:00 2001
+From: Rob N <robn@despairlabs.com>
+Date: Wed, 29 Nov 2023 04:15:48 +1100
+Subject: [PATCH] dnode_is_dirty: check dnode and its data for dirtiness
+MIME-Version: 1.0
+Content-Type: text/plain; charset=UTF-8
+Content-Transfer-Encoding: 8bit
+
+Over its history this the dirty dnode test has been changed between
+checking for a dnodes being on `os_dirty_dnodes` (`dn_dirty_link`) and
+`dn_dirty_record`.
+
+  de198f2d9 Fix lseek(SEEK_DATA/SEEK_HOLE) mmap consistency
+  2531ce372 Revert "Report holes when there are only metadata changes"
+  ec4f9b8f3 Report holes when there are only metadata changes
+  454365bba Fix dirty check in dmu_offset_next()
+  66aca2473 SEEK_HOLE should not block on txg_wait_synced()
+
+Also illumos/illumos-gate@c543ec060d illumos/illumos-gate@2bcf0248e9
+
+It turns out both are actually required.
+
+In the case of appending data to a newly created file, the dnode proper
+is dirtied (at least to change the blocksize) and dirty records are
+added.  Thus, a single logical operation is represented by separate
+dirty indicators, and must not be separated.
+
+The incorrect dirty check becomes a problem when the first block of a
+file is being appended to while another process is calling lseek to skip
+holes. There is a small window where the dnode part is undirtied while
+there are still dirty records. In this case, `lseek(fd, 0, SEEK_DATA)`
+would not know that the file is dirty, and would go to
+`dnode_next_offset()`. Since the object has no data blocks yet, it
+returns `ESRCH`, indicating no data found, which results in `ENXIO`
+being returned to `lseek()`'s caller.
+
+Since coreutils 9.2, `cp` performs sparse copies by default, that is, it
+uses `SEEK_DATA` and `SEEK_HOLE` against the source file and attempts to
+replicate the holes in the target. When it hits the bug, its initial
+search for data fails, and it goes on to call `fallocate()` to create a
+hole over the entire destination file.
+
+This has come up more recently as users upgrade their systems, getting
+OpenZFS 2.2 as well as a newer coreutils. However, this problem has been
+reproduced against 2.1, as well as on FreeBSD 13 and 14.
+
+This change simply updates the dirty check to check both types of dirty.
+If there's anything dirty at all, we immediately go to the "wait for
+sync" stage, It doesn't really matter after that; both changes are on
+disk, so the dirty fields should be correct.
+
+Sponsored-by: Klara, Inc.
+Sponsored-by: Wasabi Technology, Inc.
+Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
+Reviewed-by: Alexander Motin <mav@FreeBSD.org>
+Reviewed-by: Rich Ercolani <rincebrain@gmail.com>
+Signed-off-by: Rob Norris <rob.norris@klarasystems.com>
+Closes #15571
+Closes #15526
+Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
+---
+ module/zfs/dnode.c | 12 ++++++++++--
+ 1 file changed, 10 insertions(+), 2 deletions(-)
+
+diff --git a/module/zfs/dnode.c b/module/zfs/dnode.c
+index 7cf03264d..ad9988366 100644
+--- a/module/zfs/dnode.c
++++ b/module/zfs/dnode.c
+@@ -1764,7 +1764,14 @@ dnode_try_claim(objset_t *os, uint64_t object, int slots)
+ }
+ 
+ /*
+- * Checks if the dnode contains any uncommitted dirty records.
++ * Checks if the dnode itself is dirty, or is carrying any uncommitted records.
++ * It is important to check both conditions, as some operations (eg appending
++ * to a file) can dirty both as a single logical unit, but they are not synced
++ * out atomically, so checking one and not the other can result in an object
++ * appearing to be clean mid-way through a commit.
++ *
++ * Do not change this lightly! If you get it wrong, dmu_offset_next() can
++ * detect a hole where there is really data, leading to silent corruption.
+  */
+ boolean_t
+ dnode_is_dirty(dnode_t *dn)
+@@ -1772,7 +1779,8 @@ dnode_is_dirty(dnode_t *dn)
+ 	mutex_enter(&dn->dn_mtx);
+ 
+ 	for (int i = 0; i < TXG_SIZE; i++) {
+-		if (multilist_link_active(&dn->dn_dirty_link[i])) {
++		if (multilist_link_active(&dn->dn_dirty_link[i]) ||
++		    !list_is_empty(&dn->dn_dirty_records[i])) {
+ 			mutex_exit(&dn->dn_mtx);
+ 			return (B_TRUE);
+ 		}
+-- 
+2.39.2
+
diff --git a/debian/patches/series b/debian/patches/series
index d20a605..ac820c7 100644
--- a/debian/patches/series
+++ b/debian/patches/series
@@ -15,3 +15,4 @@
 0015-Fix-block-cloning-between-unencrypted-and-encrypted-.patch
 0016-Add-a-tunable-to-disable-BRT-support.patch
 0017-zfs-2.2.1-Disable-block-cloning-by-default.patch
+0018-dnode_is_dirty-check-dnode-and-its-data-for-dirtines.patch
-- 
2.39.2





^ permalink raw reply	[flat|nested] 5+ messages in thread

* [pve-devel] [PATCH zfsonlinux 2/2] bump version to 2.2.0-pve4
  2023-11-29  9:27 [pve-devel] (already applied) [PATCH zfsonlinux/kernel 0/4] Fix ZFS data corruption bug Fabian Grünbichler
                   ` (2 preceding siblings ...)
  2023-11-29  9:27 ` [pve-devel] [PATCH zfsonlinux 1/2] cherry-pick fix for data corruption Fabian Grünbichler
@ 2023-11-29  9:27 ` Fabian Grünbichler
  3 siblings, 0 replies; 5+ messages in thread
From: Fabian Grünbichler @ 2023-11-29  9:27 UTC (permalink / raw)
  To: pve-devel

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---
 debian/changelog | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/debian/changelog b/debian/changelog
index 5bd9d2a..f2fa727 100644
--- a/debian/changelog
+++ b/debian/changelog
@@ -1,3 +1,10 @@
+zfs-linux (2.2.0-pve4) bookworm; urgency=medium
+
+  * pick bug-fix staged for 2.2.2:
+    - fix (rare) corruption caused by dirty dnode being treated as clean
+
+ -- Proxmox Support Team <support@proxmox.com>  Wed, 29 Nov 2023 09:21:26 +0100
+
 zfs-linux (2.2.0-pve3) bookworm; urgency=medium
 
   * pick bug-fixes staged for 2.2.1:
-- 
2.39.2





^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2023-11-29  9:28 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-11-29  9:27 [pve-devel] (already applied) [PATCH zfsonlinux/kernel 0/4] Fix ZFS data corruption bug Fabian Grünbichler
2023-11-29  9:27 ` [pve-devel] [PATCH pve-kernel 1/2] update zfs to 2.2.0-pve4 Fabian Grünbichler
2023-11-29  9:27 ` [pve-devel] [PATCH pve-kernel 2/2] bump version to 6.5.11-6 Fabian Grünbichler
2023-11-29  9:27 ` [pve-devel] [PATCH zfsonlinux 1/2] cherry-pick fix for data corruption Fabian Grünbichler
2023-11-29  9:27 ` [pve-devel] [PATCH zfsonlinux 2/2] bump version to 2.2.0-pve4 Fabian Grünbichler

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox
Service provided by Proxmox Server Solutions GmbH | Privacy | Legal