all lists on lists.proxmox.com
 help / color / mirror / Atom feed
* [pve-devel] (already applied) [PATCH zfsonlinux/kernel 0/4] Fix ZFS data corruption bug
@ 2023-11-29  9:27 Fabian Grünbichler
  2023-11-29  9:27 ` [pve-devel] [PATCH pve-kernel 1/2] update zfs to 2.2.0-pve4 Fabian Grünbichler
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: Fabian Grünbichler @ 2023-11-29  9:27 UTC (permalink / raw)
  To: pve-devel

cherry-picked from upstream's staging-2.2.2 and verified using the
reproducer script on an affected machine.

kernel side (actual fix):

Fabian Grünbichler (2):
  update zfs to 2.2.0-pve4
  bump version to 6.5.11-6

 Makefile              | 2 +-
 debian/changelog      | 6 ++++++
 submodules/zfsonlinux | 2 +-
 3 files changed, 8 insertions(+), 2 deletions(-)

zfs userspace (for version consistency/reduced confusion):

Fabian Grünbichler (2):
  cherry-pick fix for data corruption
  bump version to 2.2.0-pve4

 debian/changelog                              |  7 ++
 ...heck-dnode-and-its-data-for-dirtines.patch | 97 +++++++++++++++++++
 debian/patches/series                         |  1 +
 3 files changed, 105 insertions(+)
 create mode 100644 debian/patches/0018-dnode_is_dirty-check-dnode-and-its-data-for-dirtines.patch

-- 
2.39.2





^ permalink raw reply	[flat|nested] 5+ messages in thread

* [pve-devel] [PATCH pve-kernel 1/2] update zfs to 2.2.0-pve4
  2023-11-29  9:27 [pve-devel] (already applied) [PATCH zfsonlinux/kernel 0/4] Fix ZFS data corruption bug Fabian Grünbichler
@ 2023-11-29  9:27 ` Fabian Grünbichler
  2023-11-29  9:27 ` [pve-devel] [PATCH pve-kernel 2/2] bump version to 6.5.11-6 Fabian Grünbichler
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 5+ messages in thread
From: Fabian Grünbichler @ 2023-11-29  9:27 UTC (permalink / raw)
  To: pve-devel

which contains the fix for https://github.com/openzfs/zfs/issues/15526

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---
 submodules/zfsonlinux | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/submodules/zfsonlinux b/submodules/zfsonlinux
index e295f30..00036e5 160000
--- a/submodules/zfsonlinux
+++ b/submodules/zfsonlinux
@@ -1 +1 @@
-Subproject commit e295f30e6acc962f76a71306a569f292994614c0
+Subproject commit 00036e5a6eba553a9f6d116c52c7189f773ac24d
-- 
2.39.2





^ permalink raw reply	[flat|nested] 5+ messages in thread

* [pve-devel] [PATCH pve-kernel 2/2] bump version to 6.5.11-6
  2023-11-29  9:27 [pve-devel] (already applied) [PATCH zfsonlinux/kernel 0/4] Fix ZFS data corruption bug Fabian Grünbichler
  2023-11-29  9:27 ` [pve-devel] [PATCH pve-kernel 1/2] update zfs to 2.2.0-pve4 Fabian Grünbichler
@ 2023-11-29  9:27 ` Fabian Grünbichler
  2023-11-29  9:27 ` [pve-devel] [PATCH zfsonlinux 1/2] cherry-pick fix for data corruption Fabian Grünbichler
  2023-11-29  9:27 ` [pve-devel] [PATCH zfsonlinux 2/2] bump version to 2.2.0-pve4 Fabian Grünbichler
  3 siblings, 0 replies; 5+ messages in thread
From: Fabian Grünbichler @ 2023-11-29  9:27 UTC (permalink / raw)
  To: pve-devel

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---
 Makefile         | 2 +-
 debian/changelog | 6 ++++++
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/Makefile b/Makefile
index f0d119c..991e39e 100644
--- a/Makefile
+++ b/Makefile
@@ -6,7 +6,7 @@ KERNEL_MIN=5
 KERNEL_PATCHLEVEL=11
 # increment KREL for every published package release!
 # rebuild packages with new KREL and run 'make abiupdate'
-KREL=5
+KREL=6
 
 KERNEL_MAJMIN=$(KERNEL_MAJ).$(KERNEL_MIN)
 KERNEL_VER=$(KERNEL_MAJMIN).$(KERNEL_PATCHLEVEL)
diff --git a/debian/changelog b/debian/changelog
index 646728e..e5442aa 100644
--- a/debian/changelog
+++ b/debian/changelog
@@ -1,3 +1,9 @@
+proxmox-kernel-6.5 (6.5.11-6) bookworm; urgency=medium
+
+  * cherry-pick ZFS fix for (rare) dirty dnode data corruption bug
+
+ -- Proxmox Support Team <support@proxmox.com>  Wed, 29 Nov 2023 09:32:26 +0100
+
 proxmox-kernel-6.5 (6.5.11-5) bookworm; urgency=medium
 
   * properly set CONFIG_VFIO_VIRQFD as a boolean
-- 
2.39.2





^ permalink raw reply	[flat|nested] 5+ messages in thread

* [pve-devel] [PATCH zfsonlinux 1/2] cherry-pick fix for data corruption
  2023-11-29  9:27 [pve-devel] (already applied) [PATCH zfsonlinux/kernel 0/4] Fix ZFS data corruption bug Fabian Grünbichler
  2023-11-29  9:27 ` [pve-devel] [PATCH pve-kernel 1/2] update zfs to 2.2.0-pve4 Fabian Grünbichler
  2023-11-29  9:27 ` [pve-devel] [PATCH pve-kernel 2/2] bump version to 6.5.11-6 Fabian Grünbichler
@ 2023-11-29  9:27 ` Fabian Grünbichler
  2023-11-29  9:27 ` [pve-devel] [PATCH zfsonlinux 2/2] bump version to 2.2.0-pve4 Fabian Grünbichler
  3 siblings, 0 replies; 5+ messages in thread
From: Fabian Grünbichler @ 2023-11-29  9:27 UTC (permalink / raw)
  To: pve-devel

cherry-picked from 2.2.0-staging, fixing
https://github.com/openzfs/zfs/issues/15526

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---
 ...heck-dnode-and-its-data-for-dirtines.patch | 97 +++++++++++++++++++
 debian/patches/series                         |  1 +
 2 files changed, 98 insertions(+)
 create mode 100644 debian/patches/0018-dnode_is_dirty-check-dnode-and-its-data-for-dirtines.patch

diff --git a/debian/patches/0018-dnode_is_dirty-check-dnode-and-its-data-for-dirtines.patch b/debian/patches/0018-dnode_is_dirty-check-dnode-and-its-data-for-dirtines.patch
new file mode 100644
index 0000000..f79b09b
--- /dev/null
+++ b/debian/patches/0018-dnode_is_dirty-check-dnode-and-its-data-for-dirtines.patch
@@ -0,0 +1,97 @@
+From 9b9b09f452a469458451c221debfbab944e7f081 Mon Sep 17 00:00:00 2001
+From: Rob N <robn@despairlabs.com>
+Date: Wed, 29 Nov 2023 04:15:48 +1100
+Subject: [PATCH] dnode_is_dirty: check dnode and its data for dirtiness
+MIME-Version: 1.0
+Content-Type: text/plain; charset=UTF-8
+Content-Transfer-Encoding: 8bit
+
+Over its history this the dirty dnode test has been changed between
+checking for a dnodes being on `os_dirty_dnodes` (`dn_dirty_link`) and
+`dn_dirty_record`.
+
+  de198f2d9 Fix lseek(SEEK_DATA/SEEK_HOLE) mmap consistency
+  2531ce372 Revert "Report holes when there are only metadata changes"
+  ec4f9b8f3 Report holes when there are only metadata changes
+  454365bba Fix dirty check in dmu_offset_next()
+  66aca2473 SEEK_HOLE should not block on txg_wait_synced()
+
+Also illumos/illumos-gate@c543ec060d illumos/illumos-gate@2bcf0248e9
+
+It turns out both are actually required.
+
+In the case of appending data to a newly created file, the dnode proper
+is dirtied (at least to change the blocksize) and dirty records are
+added.  Thus, a single logical operation is represented by separate
+dirty indicators, and must not be separated.
+
+The incorrect dirty check becomes a problem when the first block of a
+file is being appended to while another process is calling lseek to skip
+holes. There is a small window where the dnode part is undirtied while
+there are still dirty records. In this case, `lseek(fd, 0, SEEK_DATA)`
+would not know that the file is dirty, and would go to
+`dnode_next_offset()`. Since the object has no data blocks yet, it
+returns `ESRCH`, indicating no data found, which results in `ENXIO`
+being returned to `lseek()`'s caller.
+
+Since coreutils 9.2, `cp` performs sparse copies by default, that is, it
+uses `SEEK_DATA` and `SEEK_HOLE` against the source file and attempts to
+replicate the holes in the target. When it hits the bug, its initial
+search for data fails, and it goes on to call `fallocate()` to create a
+hole over the entire destination file.
+
+This has come up more recently as users upgrade their systems, getting
+OpenZFS 2.2 as well as a newer coreutils. However, this problem has been
+reproduced against 2.1, as well as on FreeBSD 13 and 14.
+
+This change simply updates the dirty check to check both types of dirty.
+If there's anything dirty at all, we immediately go to the "wait for
+sync" stage, It doesn't really matter after that; both changes are on
+disk, so the dirty fields should be correct.
+
+Sponsored-by: Klara, Inc.
+Sponsored-by: Wasabi Technology, Inc.
+Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
+Reviewed-by: Alexander Motin <mav@FreeBSD.org>
+Reviewed-by: Rich Ercolani <rincebrain@gmail.com>
+Signed-off-by: Rob Norris <rob.norris@klarasystems.com>
+Closes #15571
+Closes #15526
+Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
+---
+ module/zfs/dnode.c | 12 ++++++++++--
+ 1 file changed, 10 insertions(+), 2 deletions(-)
+
+diff --git a/module/zfs/dnode.c b/module/zfs/dnode.c
+index 7cf03264d..ad9988366 100644
+--- a/module/zfs/dnode.c
++++ b/module/zfs/dnode.c
+@@ -1764,7 +1764,14 @@ dnode_try_claim(objset_t *os, uint64_t object, int slots)
+ }
+ 
+ /*
+- * Checks if the dnode contains any uncommitted dirty records.
++ * Checks if the dnode itself is dirty, or is carrying any uncommitted records.
++ * It is important to check both conditions, as some operations (eg appending
++ * to a file) can dirty both as a single logical unit, but they are not synced
++ * out atomically, so checking one and not the other can result in an object
++ * appearing to be clean mid-way through a commit.
++ *
++ * Do not change this lightly! If you get it wrong, dmu_offset_next() can
++ * detect a hole where there is really data, leading to silent corruption.
+  */
+ boolean_t
+ dnode_is_dirty(dnode_t *dn)
+@@ -1772,7 +1779,8 @@ dnode_is_dirty(dnode_t *dn)
+ 	mutex_enter(&dn->dn_mtx);
+ 
+ 	for (int i = 0; i < TXG_SIZE; i++) {
+-		if (multilist_link_active(&dn->dn_dirty_link[i])) {
++		if (multilist_link_active(&dn->dn_dirty_link[i]) ||
++		    !list_is_empty(&dn->dn_dirty_records[i])) {
+ 			mutex_exit(&dn->dn_mtx);
+ 			return (B_TRUE);
+ 		}
+-- 
+2.39.2
+
diff --git a/debian/patches/series b/debian/patches/series
index d20a605..ac820c7 100644
--- a/debian/patches/series
+++ b/debian/patches/series
@@ -15,3 +15,4 @@
 0015-Fix-block-cloning-between-unencrypted-and-encrypted-.patch
 0016-Add-a-tunable-to-disable-BRT-support.patch
 0017-zfs-2.2.1-Disable-block-cloning-by-default.patch
+0018-dnode_is_dirty-check-dnode-and-its-data-for-dirtines.patch
-- 
2.39.2





^ permalink raw reply	[flat|nested] 5+ messages in thread

* [pve-devel] [PATCH zfsonlinux 2/2] bump version to 2.2.0-pve4
  2023-11-29  9:27 [pve-devel] (already applied) [PATCH zfsonlinux/kernel 0/4] Fix ZFS data corruption bug Fabian Grünbichler
                   ` (2 preceding siblings ...)
  2023-11-29  9:27 ` [pve-devel] [PATCH zfsonlinux 1/2] cherry-pick fix for data corruption Fabian Grünbichler
@ 2023-11-29  9:27 ` Fabian Grünbichler
  3 siblings, 0 replies; 5+ messages in thread
From: Fabian Grünbichler @ 2023-11-29  9:27 UTC (permalink / raw)
  To: pve-devel

Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---
 debian/changelog | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/debian/changelog b/debian/changelog
index 5bd9d2a..f2fa727 100644
--- a/debian/changelog
+++ b/debian/changelog
@@ -1,3 +1,10 @@
+zfs-linux (2.2.0-pve4) bookworm; urgency=medium
+
+  * pick bug-fix staged for 2.2.2:
+    - fix (rare) corruption caused by dirty dnode being treated as clean
+
+ -- Proxmox Support Team <support@proxmox.com>  Wed, 29 Nov 2023 09:21:26 +0100
+
 zfs-linux (2.2.0-pve3) bookworm; urgency=medium
 
   * pick bug-fixes staged for 2.2.1:
-- 
2.39.2





^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2023-11-29  9:28 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-11-29  9:27 [pve-devel] (already applied) [PATCH zfsonlinux/kernel 0/4] Fix ZFS data corruption bug Fabian Grünbichler
2023-11-29  9:27 ` [pve-devel] [PATCH pve-kernel 1/2] update zfs to 2.2.0-pve4 Fabian Grünbichler
2023-11-29  9:27 ` [pve-devel] [PATCH pve-kernel 2/2] bump version to 6.5.11-6 Fabian Grünbichler
2023-11-29  9:27 ` [pve-devel] [PATCH zfsonlinux 1/2] cherry-pick fix for data corruption Fabian Grünbichler
2023-11-29  9:27 ` [pve-devel] [PATCH zfsonlinux 2/2] bump version to 2.2.0-pve4 Fabian Grünbichler

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.
Service provided by Proxmox Server Solutions GmbH | Privacy | Legal