From: Stoiko Ivanov <s.ivanov@proxmox.com>
To: Friedrich Weber <f.weber@proxmox.com>
Cc: Proxmox VE development discussion <pve-devel@lists.proxmox.com>
Subject: Re: [pve-devel] [PATCH zfsonlinux] cherry-pick fix for overgrown dnode cache
Date: Mon, 28 Jul 2025 18:33:31 +0200
Message-ID: <20250728183331.0e60f508@rosa.proxmox.com>
In-Reply-To: <5f3e46ed-bf99-45e2-b497-fc81dc50d9b3@proxmox.com>
On Mon, 28 Jul 2025 11:52:23 +0200
Friedrich Weber <f.weber@proxmox.com> wrote:
> On 23/07/2025 20:15, Stoiko Ivanov wrote:
> > the following patch seems applicable and might fix an issue observed
> > in our enterprise support a while ago. containers run in their own
> > cgroups, thus were probably not scanned by the kernel shrinker - this
> > resulted in Dnode cache numbers of 300+% reported in arc_summary.
> >
> > FWICT the issue was introduced in ZFS 2.2.7
> > (commit 5f73630e9cbea5efa23d16809f06e0d08523b241 see:
> > https://github.com/openzfs/zfs/issues/17052#issuecomment-3065907783)
> > but I assume that the increase of zfs_arc_max by default makes it
> > trigger OOMs far easier.
> >
> > The discussion of the PR was quite instructive:
> > https://github.com/openzfs/zfs/pull/17542
> >
> > minimally tested on a pair of trixie VMs (building + running
> > replication of a couple of containers)
> >
> > Suggested-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
> > Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
> > ---
>
> FWIW, tested this by setting up ZFS on root with an additional RAID-0
> for guest disks, creating a Debian container and running a script [1] in it.
>
> ARC size targets as reported by arc_summary:
>
> Target size (adaptive):  100.0 %  794.0 MiB
> Min size (hard limit):    31.3 %  248.2 MiB
> Max size (high water):       3:1  794.0 MiB
>
> With 6.14.8-2-pve (ZFS 2.3.3 without this patch), the Dnode Cache and
> ARC grow considerably while the script is running and both stay like
> this after the script has exited:
>
> ARC size (current):  294.8 %    2.3 GiB
> Dnode cache target:   10.0 %   79.4 MiB
> Dnode cache size:   1181.9 %  938.5 MiB
>
> Same on
> - 6.8.12-13-pve (ZFS 2.2.8)
> - 6.8.12-8-pve (ZFS 2.2.7)
>
> With 6.8.12-6-pve (ZFS 2.2.6), the Dnode cache size still grows to >100%
> and seems to stay there, but the ARC manages to stay below 100%:
>
> ARC size (current):   96.8 %  768.9 MiB
> Dnode cache target:   10.0 %   79.4 MiB
> Dnode cache size:    333.6 %  264.9 MiB
>
> With this patch on top of 6.14.8-2-pve, the Dnode Cache and ARC still
> grow while the script is running, but both shrink again to <100% quickly
> afterwards (within a minute or so):
>
> ARC size (current):   30.9 %  245.3 MiB
> Dnode cache target:   10.0 %   79.4 MiB
> Dnode cache size:     99.0 %   78.6 MiB
>
> We have an issue in enterprise support with a container-heavy workload
> on ZFS 2.2.7 that is likely affected by this. However, they also saw
> high Dnode cache size and >100% ARC on ZFS 2.2.6 -- the latter I
> couldn't reproduce with ZFS 2.2.6, but perhaps I missed some factor.
Thanks for the tests, short reproducer and feedback!
>
> Might be nice if we could make the fix available on PVE 8 as well,
> though I'm not sure how easily this can be backported to ZFS 2.2.
I sent my attempt at a backport:
https://lore.proxmox.com/pve-devel/20250728163041.1287899-1-s.ivanov@proxmox.com/T/#u
>
> [1]
>
> #!/bin/bash
> for h in $(seq 100); do
>     (
>         mkdir "dir-$h"
>         cd "dir-$h" || exit 1
>         for i in $(seq 100);
>         do
>             mkdir "$i"
>             for j in $(seq 100);
>             do
>                 echo test > "$i/$j.txt"
>             done
>         done
>     )&
> done
>
>
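For anyone re-running the reproducer, here is a minimal sketch for watching the ratio that arc_summary prints as "Dnode cache size" without calling arc_summary each time. It assumes that the arcstats kstat (/proc/spl/kstat/zfs/arcstats) exposes the fields dnode_size and arc_dnode_limit with the value in the third column. The function name dnode_cache_pct is made up for illustration and is not part of any ZFS tooling.

```shell
#!/bin/bash
# Hypothetical helper: print dnode cache usage as a percentage of its limit.
# Assumption: the kstat file has "name type data" columns and contains the
# rows dnode_size and arc_dnode_limit.
dnode_cache_pct() {
    # Default to the live kstat file; a path argument allows using a saved copy.
    local stats="${1:-/proc/spl/kstat/zfs/arcstats}"
    awk '
        $1 == "dnode_size"      { size = $3 }
        $1 == "arc_dnode_limit" { limit = $3 }
        END {
            # Fail instead of dividing by zero if the limit is missing or 0.
            if (limit > 0) {
                printf "%.1f\n", 100 * size / limit
            } else {
                exit 1
            }
        }
    ' "$stats"
}
```

Called without an argument it reads the live kstat file, so it can be run in a loop on the host while the script above runs inside the container.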
Thread overview: 4+ messages
2025-07-23 18:14 Stoiko Ivanov
2025-07-23 18:46 ` [pve-devel] applied: " Thomas Lamprecht
2025-07-28 9:52 ` [pve-devel] " Friedrich Weber
2025-07-28 16:33 ` Stoiko Ivanov [this message]