From: Friedrich Weber <f.weber@proxmox.com>
To: Proxmox VE development discussion <pve-devel@lists.proxmox.com>,
	Stoiko Ivanov <s.ivanov@proxmox.com>
Subject: Re: [pve-devel] [PATCH zfsonlinux] cherry-pick fix for overgrown dnode cache
Date: Mon, 28 Jul 2025 11:52:23 +0200
Message-ID: <5f3e46ed-bf99-45e2-b497-fc81dc50d9b3@proxmox.com>
In-Reply-To: <20250723181453.1082366-1-s.ivanov@proxmox.com>

On 23/07/2025 20:15, Stoiko Ivanov wrote:
> the following patch seems applicable and might fix an issue observed
> in our enterprise support a while ago. containers run in their own
> cgroups, thus were probably not scanned by the kernel shrinker - this
> resulted in Dnode cache numbers of 300+% reported in arc_summary.
> 
> FWICT the issue was introduced in ZFS 2.2.7
> (commit 5f73630e9cbea5efa23d16809f06e0d08523b241 see:
> https://github.com/openzfs/zfs/issues/17052#issuecomment-3065907783)
> but I assume that the increase of zfs_arc_max by default makes it
> trigger OOMs far easier.
> 
> The discussion of the PR was quite instructive:
> https://github.com/openzfs/zfs/pull/17542
> 
> minimally tested on a pair of trixie VMs (building + running
> replication of a couple of containers)
> 
> Suggested-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
> Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
> ---

FWIW, tested this by setting up ZFS on root with an additional RAID-0
for guest disks, creating a Debian container and running a script [1] in it.

ARC size targets as reported by arc_summary:

        Target size (adaptive):                       100.0 %  794.0 MiB
        Min size (hard limit):                         31.3 %  248.2 MiB
        Max size (high water):                            3:1  794.0 MiB

With 6.14.8-2-pve (ZFS 2.3.3 without this patch), the Dnode Cache and
ARC grow considerably while the script is running and both stay like
this after the script has exited:

ARC size (current):                                   294.8 %    2.3 GiB
        Dnode cache target:                            10.0 %   79.4 MiB
        Dnode cache size:                            1181.9 %  938.5 MiB

Same on
- 6.8.12-13-pve (ZFS 2.2.8)
- 6.8.12-8-pve (ZFS 2.2.7)

With 6.8.12-6-pve (ZFS 2.2.6), the Dnode cache size still grows to >100%
and seems to stay there, but the ARC manages to stay below 100%:

ARC size (current):                                    96.8 %  768.9 MiB
        Dnode cache target:                            10.0 %   79.4 MiB
        Dnode cache size:                             333.6 %  264.9 MiB

With this patch on top of 6.14.8-2-pve, the Dnode Cache and ARC still
grow while the script is running, but both shrink again to <100% quickly
afterwards (within a minute or so):

ARC size (current):                                    30.9 %  245.3 MiB
        Dnode cache target:                            10.0 %   79.4 MiB
        Dnode cache size:                              99.0 %   78.6 MiB
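
The "Dnode cache size" percentages in the excerpts above are the dnode
cache size relative to its target (e.g. 78.6 MiB of 79.4 MiB is about 99 %).
A minimal sketch for reading the same ratio without arc_summary follows.
It assumes the kstat file /proc/spl/kstat/zfs/arcstats has
"name type value" lines with fields called dnode_size and arc_dnode_limit;
those field names and the helper name dnode_cache_pct are assumptions and
should be checked against the installed ZFS version.

```shell
#!/bin/bash
# Hypothetical helper: print dnode cache size as a percentage of its limit.
# Assumption: the stats file contains "name type value" lines, including
# fields named dnode_size and arc_dnode_limit (verify on your ZFS version).
dnode_cache_pct() {
	# Optional first argument: path to a kstat-style file (useful for testing).
	local stats="${1:-/proc/spl/kstat/zfs/arcstats}"
	awk '
		$1 == "dnode_size"      { size = $3 }
		$1 == "arc_dnode_limit" { limit = $3 }
		END {
			# Bail out if the limit field was missing or zero.
			if (limit <= 0) {
				print "dnode limit not found" > "/dev/stderr"
				exit 1
			}
			printf "%.1f\n", 100 * size / limit
		}
	' "$stats"
}
```

Called without an argument on a host with ZFS loaded, this should print a
single number comparable to the "Dnode cache size" percentage, provided the
assumed field names match.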

We have an issue in enterprise support with a container-heavy workload
on ZFS 2.2.7 that is likely affected by this. However, they also saw
high Dnode cache size and >100% ARC on ZFS 2.2.6 -- the latter I
couldn't reproduce with ZFS 2.2.6, but perhaps I missed some factor.

Might be nice if we could make the fix available on PVE 8 as well,
though I'm not sure how easily this can be backported to ZFS 2.2.

[1]

#!/bin/bash
for h in $(seq 100); do
	(
	mkdir "dir-$h"
	cd "dir-$h" || exit 1
	for i in $(seq 100);
	do
		mkdir "$i"
		for j in $(seq 100);
		do
			echo test > "$i/$j.txt"
		done
	done
	)&
done
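
For scale: the script above starts 100 background subshells, each creating
100 directories with 100 files, so about one million small files in total.
It returns as soon as the subshells are started. A scaled-down variant for
quick experiments could look like the sketch below. The function name
make_tree and its parameters are hypothetical and not part of the test
described in this mail. Unlike the original, it waits for all background
jobs before returning.

```shell
#!/bin/bash
# Hypothetical scaled-down variant of the reproducer above.
# Usage: make_tree BASE [TOPS] [DIRS] [FILES]
# Creates BASE/dir-<h>/<i>/<j>.txt, one background subshell per top-level dir.
make_tree() {
	local base="$1" tops="${2:-100}" dirs="${3:-100}" files="${4:-100}"
	local h i j
	for h in $(seq "$tops"); do
		(
		mkdir -p "$base/dir-$h" || exit 1
		cd "$base/dir-$h" || exit 1
		for i in $(seq "$dirs"); do
			mkdir "$i" || exit 1
			for j in $(seq "$files"); do
				echo test > "$i/$j.txt"
			done
		done
		) &
	done
	# Difference from the original script: block until all subshells finish.
	wait
}
```

For example, `make_tree /tmp/dnode-test 2 3 4` would create 24 files in
8 directories below /tmp/dnode-test.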



