From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <5f3e46ed-bf99-45e2-b497-fc81dc50d9b3@proxmox.com>
Date: Mon, 28 Jul 2025 11:52:23 +0200
From: Friedrich Weber
To: Proxmox VE development discussion, Stoiko Ivanov
References: <20250723181453.1082366-1-s.ivanov@proxmox.com>
In-Reply-To: <20250723181453.1082366-1-s.ivanov@proxmox.com>
Subject: Re: [pve-devel] [PATCH zfsonlinux] cherry-pick fix for overgrown dnode cache

On 23/07/2025 20:15, Stoiko Ivanov wrote:
> the following patch seems applicable and might fix an issue observed
> in our enterprise support a while ago. containers run in their own
> cgroups, thus were probably not scanned by the kernel shrinker - this
> resulted in Dnode cache numbers of 300+% reported in arc_summary.
>
> FWICT the issue was introduced in ZFS 2.2.7
> (commit 5f73630e9cbea5efa23d16809f06e0d08523b241 see:
> https://github.com/openzfs/zfs/issues/17052#issuecomment-3065907783)
> but I assume that the increase of zfs_arc_max by default makes it
> trigger OOMs far easier.
>
> The discussion of the PR was quite instructive:
> https://github.com/openzfs/zfs/pull/17542
>
> minimally tested on a pair of trixie VMs (building + running
> replication of a couple of containers)
>
> Suggested-by: Thomas Lamprecht
> Signed-off-by: Stoiko Ivanov
> ---

FWIW, tested this by setting up ZFS on root with an additional RAID-0
for guest disks, creating a Debian container and running a script [1]
in it.
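In case someone wants to reproduce this, something along the following
lines should do (container ID and script path are just placeholders,
adjust as needed):

  # 100 = example container ID, mkfiles.sh = the script from [1]
  pct push 100 mkfiles.sh /root/mkfiles.sh
  pct exec 100 -- bash /root/mkfiles.sh
  # afterwards, check the host-side numbers:
  arc_summary | grep -E 'ARC size|Dnode cache'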
ARC size targets as reported by arc_summary:

    Target size (adaptive):                 100.0 %  794.0 MiB
    Min size (hard limit):                   31.3 %  248.2 MiB
    Max size (high water):                      3:1  794.0 MiB

With 6.14.8-2-pve (ZFS 2.3.3 without this patch), the Dnode cache and
ARC grow considerably while the script is running, and both stay like
this after the script has exited:

    ARC size (current):                     294.8 %    2.3 GiB
    Dnode cache target:                      10.0 %   79.4 MiB
    Dnode cache size:                      1181.9 %  938.5 MiB

Same on
- 6.8.12-13-pve (ZFS 2.2.8)
- 6.8.12-8-pve (ZFS 2.2.7)

With 6.8.12-6-pve (ZFS 2.2.6), the Dnode cache size still grows
to >100% and seems to stay there, but the ARC manages to stay below
100%:

    ARC size (current):                      96.8 %  768.9 MiB
    Dnode cache target:                      10.0 %   79.4 MiB
    Dnode cache size:                       333.6 %  264.9 MiB

With this patch on top of 6.14.8-2-pve, the Dnode cache and ARC still
grow while the script is running, but both shrink back to <100%
quickly afterwards (within a minute or so):

    ARC size (current):                      30.9 %  245.3 MiB
    Dnode cache target:                      10.0 %   79.4 MiB
    Dnode cache size:                        99.0 %   78.6 MiB

We have an issue in enterprise support with a container-heavy workload
on ZFS 2.2.7 that is likely affected by this. However, they also saw a
high Dnode cache size and an ARC above 100% on ZFS 2.2.6 -- the latter
I couldn't reproduce, but perhaps I missed some factor.

Might be nice if we could make the fix available on PVE 8 as well,
though I'm not sure how easily this can be backported to ZFS 2.2.

[1]

#!/bin/bash
for h in $(seq 100); do
    (
        mkdir "dir-$h"
        cd "dir-$h" || exit 1
        for i in $(seq 100); do
            mkdir "$i"
            for j in $(seq 100); do
                echo test > "$i/$j.txt"
            done
        done
    )&
done
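The script spawns 100 backgrounded subshells that each create 100x100
small files, i.e. roughly a million files in total, which is what
inflates the dnode cache. While it runs, the counters can also be
watched via the raw kstats instead of arc_summary -- a quick sketch,
assuming the usual arcstats field names (size, c, dnode_size,
arc_dnode_limit):

  watch -n1 "grep -wE 'size|c|dnode_size|arc_dnode_limit' /proc/spl/kstat/zfs/arcstats"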