From: Stoiko Ivanov <s.ivanov@proxmox.com>
To: Friedrich Weber <f.weber@proxmox.com>
Cc: Proxmox VE development discussion <pve-devel@lists.proxmox.com>
Subject: Re: [pve-devel] [PATCH zfsonlinux] cherry-pick fix for overgrown dnode cache
Date: Mon, 28 Jul 2025 18:33:31 +0200 [thread overview]
Message-ID: <20250728183331.0e60f508@rosa.proxmox.com> (raw)
In-Reply-To: <5f3e46ed-bf99-45e2-b497-fc81dc50d9b3@proxmox.com>
On Mon, 28 Jul 2025 11:52:23 +0200
Friedrich Weber <f.weber@proxmox.com> wrote:
> On 23/07/2025 20:15, Stoiko Ivanov wrote:
> > The following patch seems applicable and might fix an issue observed
> > in our enterprise support a while ago: containers run in their own
> > cgroups and thus were probably not scanned by the kernel shrinker,
> > which resulted in dnode cache numbers of 300+% reported by arc_summary.
> >
> > FWICT the issue was introduced in ZFS 2.2.7
> > (commit 5f73630e9cbea5efa23d16809f06e0d08523b241, see:
> > https://github.com/openzfs/zfs/issues/17052#issuecomment-3065907783),
> > but I assume that the increased default zfs_arc_max makes it
> > trigger OOMs far more easily.
> >
> > The discussion of the PR was quite instructive:
> > https://github.com/openzfs/zfs/pull/17542
> >
> > Minimally tested on a pair of trixie VMs (building the package and
> > running replication of a couple of containers).
> >
> > Suggested-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
> > Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
> > ---
>
> FWIW, tested this by setting up ZFS on root with an additional RAID-0
> for guest disks, creating a Debian container and running a script [1] in it.
>
> ARC size targets as reported by arc_summary:
>
> Target size (adaptive): 100.0 % 794.0 MiB
> Min size (hard limit): 31.3 % 248.2 MiB
> Max size (high water): 3:1 794.0 MiB
>
> With 6.14.8-2-pve (ZFS 2.3.3 without this patch), the Dnode cache and
> ARC grow considerably while the script is running, and both stay that
> way after the script has exited:
>
> ARC size (current): 294.8 % 2.3 GiB
> Dnode cache target: 10.0 % 79.4 MiB
> Dnode cache size: 1181.9 % 938.5 MiB
>
> Same on
> - 6.8.12-13-pve (ZFS 2.2.8)
> - 6.8.12-8-pve (ZFS 2.2.7)
>
> With 6.8.12-6-pve (ZFS 2.2.6), the Dnode cache size still grows to >100%
> and seems to stay there, but the ARC manages to stay below 100%:
>
> ARC size (current): 96.8 % 768.9 MiB
> Dnode cache target: 10.0 % 79.4 MiB
> Dnode cache size: 333.6 % 264.9 MiB
>
> With this patch on top of 6.8.12-9-pve, the Dnode cache and ARC still
> grow while the script is running, but both shrink back below 100% quickly
> afterwards (within a minute or so):
>
> ARC size (current): 30.9 % 245.3 MiB
> Dnode cache target: 10.0 % 79.4 MiB
> Dnode cache size: 99.0 % 78.6 MiB
>
> We have an issue in enterprise support with a container-heavy workload
> on ZFS 2.2.7 that is likely affected by this. However, they also saw
> high Dnode cache size and >100% ARC on ZFS 2.2.6 -- the latter I
> couldn't reproduce with ZFS 2.2.6, but perhaps I missed some factor.
Thanks for the tests, short reproducer and feedback!
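
To put a number on the overshoot without scrolling through the full
arc_summary output, the raw counters can also be read from
/proc/spl/kstat/zfs/arcstats (the `dnode_size` and `arc_dnode_limit`
fields). A minimal sketch; it is shown here against sample values (roughly
matching the ~1182% case above), since the kstat file only exists on a host
with the ZFS module loaded - on a live system, point the awk at
/proc/spl/kstat/zfs/arcstats instead of the heredoc:

```shell
#!/bin/sh
# Report dnode cache usage as a percentage of its target.
# kstat lines have the form: <name> <type> <value>
awk '
    $1 == "dnode_size"      { size = $3 }   # current dnode cache, bytes
    $1 == "arc_dnode_limit" { limit = $3 }  # dnode cache target, bytes
    END { printf "dnode cache: %.1f%% of target\n", 100 * size / limit }
' <<'EOF'
dnode_size                      4    984178688
arc_dnode_limit                 4    83261440
EOF
```

Watching that one line in a loop while the reproducer runs makes it easy to
see whether the cache shrinks back after the script exits.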
>
> Might be nice if we could make the fix available on PVE 8 as well,
> though I'm not sure how easily this can be backported to ZFS 2.2.
sent my attempt at a backport:
https://lore.proxmox.com/pve-devel/20250728163041.1287899-1-s.ivanov@proxmox.com/T/#u
>
> [1]
>
> #!/bin/bash
> # Create ~1M small files (100 background jobs x 100 dirs x 100 files
> # each) to inflate the ZFS dnode cache.
> for h in $(seq 100); do
>     (
>         mkdir "dir-$h"
>         cd "dir-$h" || exit 1
>         for i in $(seq 100); do
>             mkdir "$i"
>             for j in $(seq 100); do
>                 echo test > "$i/$j.txt"
>             done
>         done
>     ) &
> done
>
>
_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
Thread overview: 4 messages
2025-07-23 18:14 Stoiko Ivanov
2025-07-23 18:46 ` [pve-devel] applied: " Thomas Lamprecht
2025-07-28  9:52 ` [pve-devel] " Friedrich Weber
2025-07-28 16:33   ` Stoiko Ivanov [this message]