From: Friedrich Weber <f.weber@proxmox.com>
To: Proxmox VE development discussion <pve-devel@lists.proxmox.com>,
	Stoiko Ivanov <s.ivanov@proxmox.com>
Subject: Re: [pve-devel] [PATCH zfsonlinux] cherry-pick fix for overgrown dnode cache
Date: Mon, 28 Jul 2025 11:52:23 +0200
Message-ID: <5f3e46ed-bf99-45e2-b497-fc81dc50d9b3@proxmox.com>
In-Reply-To: <20250723181453.1082366-1-s.ivanov@proxmox.com>

On 23/07/2025 20:15, Stoiko Ivanov wrote:
> the following patch seems applicable and might fix an issue observed
> in our enterprise support a while ago. containers run in their own
> cgroups, thus were probably not scanned by the kernel shrinker - this
> resulted in Dnode cache numbers of 300+% reported in arc_summary.
> 
> FWICT the issue was introduced in ZFS 2.2.7
> (commit 5f73630e9cbea5efa23d16809f06e0d08523b241 see:
> https://github.com/openzfs/zfs/issues/17052#issuecomment-3065907783)
> but I assume that the increase of zfs_arc_max by default makes it
> trigger OOMs far easier.
> 
> The discussion of the PR was quite instructive:
> https://github.com/openzfs/zfs/pull/17542
> 
> minimally tested on a pair of trixie VMs (building + running
> replication of a couple of containers)
> 
> Suggested-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
> Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
> ---

FWIW, tested this by setting up ZFS on root with an additional RAID-0
for guest disks, creating a Debian container and running a script [1] in it.

ARC size targets as reported by arc_summary:

        Target size (adaptive):                       100.0 %  794.0 MiB
        Min size (hard limit):                         31.3 %  248.2 MiB
        Max size (high water):                            3:1  794.0 MiB

With 6.14.8-2-pve (ZFS 2.3.3 without this patch), the Dnode Cache and
ARC grow considerably while the script is running and both stay like
this after the script has exited:

ARC size (current):                                   294.8 %    2.3 GiB
        Dnode cache target:                            10.0 %   79.4 MiB
        Dnode cache size:                            1181.9 %  938.5 MiB

Same on
- 6.8.12-13-pve (ZFS 2.2.8)
- 6.8.12-8-pve (ZFS 2.2.7)
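
The "Dnode cache size" percentage above is relative to the dnode cache
target, not to the ARC size. For anyone who wants to check the ratio
without arc_summary, here is a minimal sketch. It assumes the arcstats
kstat has "name type data" rows with fields named dnode_size and
arc_dnode_limit; the function name dnode_cache_pct is made up for this
example.

```shell
#!/bin/bash
# Print the dnode cache size as a percentage of its target.
# Assumption: the kstat file has "name type data" rows, including
# fields called dnode_size and arc_dnode_limit (values in bytes).
# The optional first argument overrides the kstat path.
dnode_cache_pct() {
	awk '
		$1 == "dnode_size"      { size = $3 }
		$1 == "arc_dnode_limit" { limit = $3 }
		END {
			if (limit > 0)
				printf "%.1f\n", 100 * size / limit
			else
				exit 1	# limit missing or zero, nothing to report
		}
	' "${1:-/proc/spl/kstat/zfs/arcstats}"
}
```

Calling dnode_cache_pct without arguments reads the live kstat file on
the host; passing a path makes it usable on a saved copy of arcstats.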

With 6.8.12-6-pve (ZFS 2.2.6), the Dnode cache size still grows to >100%
and seems to stay there, but the ARC manages to stay below 100%:

ARC size (current):                                    96.8 %  768.9 MiB
        Dnode cache target:                            10.0 %   79.4 MiB
        Dnode cache size:                             333.6 %  264.9 MiB

With this patch on top of 6.8.12-9-pve, the Dnode Cache and ARC still
grow while the script is running, but both shrink again to <100% quickly
afterwards (within a minute or so):

ARC size (current):                                    30.9 %  245.3 MiB
        Dnode cache target:                            10.0 %   79.4 MiB
        Dnode cache size:                              99.0 %   78.6 MiB
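
To watch whether the caches shrink after the script finishes, a small
sampling loop may be more convenient than rerunning arc_summary by hand.
This is only a sketch: it assumes the arcstats kstat exposes fields named
size and dnode_size in bytes, and the function name sample_arc and its
defaults (12 samples, 5 seconds apart) are arbitrary choices for
illustration.

```shell
#!/bin/bash
# Print ARC size and dnode cache size in MiB a fixed number of times.
# Usage: sample_arc [interval_seconds] [count] [arcstats_file]
# Assumption: the kstat file has "name type data" rows with fields
# called size and dnode_size (values in bytes).
sample_arc() {
	interval="${1:-5}"
	count="${2:-12}"
	stats="${3:-/proc/spl/kstat/zfs/arcstats}"
	i=0
	while [ "$i" -lt "$count" ]; do
		awk '
			$1 == "size"       { arc = $3 }
			$1 == "dnode_size" { dnode = $3 }
			END {
				printf "arc=%.1f MiB dnode=%.1f MiB\n",
					arc / 1048576, dnode / 1048576
			}
		' "$stats"
		i=$((i + 1))
		# Do not sleep after the final sample.
		if [ "$i" -lt "$count" ]; then
			sleep "$interval"
		fi
	done
}
```

Running sample_arc right after the script exits should show whether both
values drop back below their targets over the following minute.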

We have an issue in enterprise support with a container-heavy workload
on ZFS 2.2.7 that is likely affected by this. However, they also saw
high Dnode cache size and >100% ARC on ZFS 2.2.6 -- the latter I
couldn't reproduce with ZFS 2.2.6, but perhaps I missed some factor.

Might be nice if we could make the fix available on PVE 8 as well,
though I'm not sure how easily this can be backported to ZFS 2.2.

[1]

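#!/bin/bash
# Create 100 top-level directories, each with 100 subdirectories
# holding 100 small files (1,000,000 files in total).
for h in $(seq 100); do
	# Each top-level directory is populated by its own background subshell.
	(
	mkdir "dir-$h"
	cd "dir-$h" || exit 1
	for i in $(seq 100);
	do
		mkdir "$i"
		for j in $(seq 100);
		do
			echo test > "$i/$j.txt"
		done
	done
	)&
done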



