public inbox for pve-user@lists.proxmox.com
 help / color / mirror / Atom feed
* [PVE-User] A less aggressive OOM?
@ 2025-07-07  9:26 Marco Gaiarin
  2025-07-07 21:39 ` Victor Rodriguez
  2025-07-08 12:05 ` Roland via pve-user
  0 siblings, 2 replies; 10+ messages in thread
From: Marco Gaiarin @ 2025-07-07  9:26 UTC (permalink / raw)
  To: pve-user


We have upgraded a set of clusters from PVE6 to PVE8, and we have found that
in newer kernels the OOM killer is a bit more 'aggressive' and sometimes kills a VM.

Nodes have plenty of RAM (64GB; 2-3 VMs per node, each with 8GB of RAM), the VMs
have the QEMU guest agent installed and ballooning enabled, but OOM still happens
sometimes. Clearly, if OOM hits the main VM that runs the local DNS, we get some
trouble.


I've looked in the PVE wiki, but found nothing. Is there some way to relax the
OOM killer, or control its behaviour?

The nodes have no swap, so probably the best thing to do (but the hardest
one ;-) is to set up some swap with a lower swappiness, but I'm seeking
feedback.
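(Something like the following is what I have in mind, assuming the swapfile can
live on a non-ZFS filesystem; sizes and values are just an example:

  dd if=/dev/zero of=/swapfile bs=1M count=8192
  chmod 600 /swapfile
  mkswap /swapfile
  swapon /swapfile
  echo '/swapfile none swap sw 0 0' >> /etc/fstab

  # keep anonymous pages in RAM as long as possible, swap only under pressure
  sysctl -w vm.swappiness=10
  echo 'vm.swappiness = 10' > /etc/sysctl.d/90-swappiness.conf
)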


Thanks.

-- 



_______________________________________________
pve-user mailing list
pve-user@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-user


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PVE-User] A less aggressive OOM?
  2025-07-07  9:26 [PVE-User] A less aggressive OOM? Marco Gaiarin
@ 2025-07-07 21:39 ` Victor Rodriguez
  2025-07-08 16:31   ` Marco Gaiarin
  2025-07-08 12:05 ` Roland via pve-user
  1 sibling, 1 reply; 10+ messages in thread
From: Victor Rodriguez @ 2025-07-07 21:39 UTC (permalink / raw)
  To: Proxmox VE user list, Marco Gaiarin

Hi,

I would start by analyzing the memory status at the time of the OOM. 
There should be some lines in the journal/syslog where the kernel writes 
what the memory looked like, so you can figure out why it had to kill a 
process.
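For example, something like this should show the whole OOM report, including
the Mem-Info block and the task list (assuming the default kernel logging on PVE):

  journalctl -k | grep -i -A 40 'invoked oom-killer'
  # dmesg only keeps the current boot:
  dmesg -T | grep -i -B 5 -A 40 'out of memory'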

It makes little sense that OOM triggers on 64GB hosts with just 24GB 
configured in VMs and, probably, less real usage. IMHO it's not the VMs 
that fill your memory up to the point of OOM, but something else: another 
process, the ZFS ARC, maybe even a memory leak. Maybe some process is 
producing severe memory fragmentation.

Regards,



On 7/7/25 11:26, Marco Gaiarin wrote:
> We have upgraded a set of clusters from PVE6 to PVE8, and we have found that
> in newer kernels the OOM killer is a bit more 'aggressive' and sometimes kills a VM.
>
> Nodes have plenty of RAM (64GB; 2-3 VMs per node, each with 8GB of RAM), the VMs
> have the QEMU guest agent installed and ballooning enabled, but OOM still happens
> sometimes. Clearly, if OOM hits the main VM that runs the local DNS, we get some
> trouble.
>
>
> I've looked in the PVE wiki, but found nothing. Is there some way to relax the
> OOM killer, or control its behaviour?
>
> The nodes have no swap, so probably the best thing to do (but the hardest
> one ;-) is to set up some swap with a lower swappiness, but I'm seeking
> feedback.
>
>
> Thanks.
>
-- 


_______________________________________________
pve-user mailing list
pve-user@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-user


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PVE-User] A less aggressive OOM?
  2025-07-07  9:26 [PVE-User] A less aggressive OOM? Marco Gaiarin
  2025-07-07 21:39 ` Victor Rodriguez
@ 2025-07-08 12:05 ` Roland via pve-user
  1 sibling, 0 replies; 10+ messages in thread
From: Roland via pve-user @ 2025-07-08 12:05 UTC (permalink / raw)
  To: Proxmox VE user list; +Cc: Roland

[-- Attachment #1: Type: message/rfc822, Size: 9044 bytes --]

From: Roland <devzero@web.de>
To: Proxmox VE user list <pve-user@lists.proxmox.com>
Subject: Re: [PVE-User] A less aggressive OOM?
Date: Tue, 8 Jul 2025 14:05:54 +0200
Message-ID: <6b15b452-0fc6-41ee-a1f7-34cd7943ab38@web.de>

hi,

it's a little bit weird that OOM kicks in with <32GB of RAM in VMs when you 
have 64GB

take a closer look at why this happens, i.e. why OOM thinks there is RAM 
pressure
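for example, by watching where the memory actually goes while the backup runs
(plain standard tools, just a sketch):

  free -h
  grep -E 'MemAvailable|^Slab|SUnreclaim' /proc/meminfo
  slabtop -o -s c | head -n 15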

roland

Am 07.07.25 um 11:26 schrieb Marco Gaiarin:
> We have upgraded a set of clusters from PVE6 to PVE8, and we have found that
> in newer kernels the OOM killer is a bit more 'aggressive' and sometimes kills a VM.
>
> Nodes have plenty of RAM (64GB; 2-3 VMs per node, each with 8GB of RAM), the VMs
> have the QEMU guest agent installed and ballooning enabled, but OOM still happens
> sometimes. Clearly, if OOM hits the main VM that runs the local DNS, we get some
> trouble.
>
>
> I've looked in the PVE wiki, but found nothing. Is there some way to relax the
> OOM killer, or control its behaviour?
>
> The nodes have no swap, so probably the best thing to do (but the hardest
> one ;-) is to set up some swap with a lower swappiness, but I'm seeking
> feedback.
>
>
> Thanks.
>

[-- Attachment #2: Type: text/plain, Size: 157 bytes --]

_______________________________________________
pve-user mailing list
pve-user@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-user

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PVE-User] A less aggressive OOM?
  2025-07-07 21:39 ` Victor Rodriguez
@ 2025-07-08 16:31   ` Marco Gaiarin
  2025-07-10  8:56     ` Victor Rodriguez
  0 siblings, 1 reply; 10+ messages in thread
From: Marco Gaiarin @ 2025-07-08 16:31 UTC (permalink / raw)
  To: Victor Rodriguez, Roland; +Cc: Proxmox VE user list

Mandi! Victor Rodriguez
  In chel di` si favelave...

> I would start by analyzing the memory status at the time of the OOM. There
> should be some lines in the journal/syslog where the kernel writes what the
> memory looked like, so you can figure out why it had to kill a process.

This is the full OOM log:

Jul  4 20:00:12 pppve1 kernel: [3375931.660119] kvm invoked oom-killer: gfp_mask=0xcc0(GFP_KERNEL), order=0, oom_score_adj=0
Jul  4 20:00:12 pppve1 kernel: [3375931.669158] CPU: 1 PID: 4088 Comm: kvm Tainted: P           O       6.8.12-10-pve #1
Jul  4 20:00:12 pppve1 kernel: [3375931.677778] Hardware name: Dell Inc. PowerEdge T440/021KCD, BIOS 2.24.0 04/02/2025
Jul  4 20:00:12 pppve1 kernel: [3375931.686211] Call Trace:
Jul  4 20:00:12 pppve1 kernel: [3375931.689504]  <TASK>
Jul  4 20:00:12 pppve1 kernel: [3375931.692428]  dump_stack_lvl+0x76/0xa0
Jul  4 20:00:12 pppve1 kernel: [3375931.696915]  dump_stack+0x10/0x20
Jul  4 20:00:12 pppve1 kernel: [3375931.701057]  dump_header+0x47/0x1f0
Jul  4 20:00:12 pppve1 kernel: [3375931.705358]  oom_kill_process+0x110/0x240
Jul  4 20:00:12 pppve1 kernel: [3375931.710169]  out_of_memory+0x26e/0x560
Jul  4 20:00:12 pppve1 kernel: [3375931.714707]  __alloc_pages+0x10ce/0x1320
Jul  4 20:00:12 pppve1 kernel: [3375931.719422]  alloc_pages_mpol+0x91/0x1f0
Jul  4 20:00:12 pppve1 kernel: [3375931.724136]  alloc_pages+0x54/0xb0
Jul  4 20:00:12 pppve1 kernel: [3375931.728320]  __get_free_pages+0x11/0x50
Jul  4 20:00:12 pppve1 kernel: [3375931.732938]  __pollwait+0x9e/0xe0
Jul  4 20:00:12 pppve1 kernel: [3375931.737015]  eventfd_poll+0x2c/0x70
Jul  4 20:00:12 pppve1 kernel: [3375931.741261]  do_sys_poll+0x2f4/0x610
Jul  4 20:00:12 pppve1 kernel: [3375931.745587]  ? __pfx___pollwait+0x10/0x10
Jul  4 20:00:12 pppve1 kernel: [3375931.750332]  ? __pfx_pollwake+0x10/0x10
Jul  4 20:00:12 pppve1 kernel: [3375931.754900]  ? __pfx_pollwake+0x10/0x10
Jul  4 20:00:12 pppve1 kernel: [3375931.759463]  ? __pfx_pollwake+0x10/0x10
Jul  4 20:00:12 pppve1 kernel: [3375931.764011]  ? __pfx_pollwake+0x10/0x10
Jul  4 20:00:12 pppve1 kernel: [3375931.768617]  ? __pfx_pollwake+0x10/0x10
Jul  4 20:00:12 pppve1 kernel: [3375931.773165]  ? __pfx_pollwake+0x10/0x10
Jul  4 20:00:12 pppve1 kernel: [3375931.777688]  ? __pfx_pollwake+0x10/0x10
Jul  4 20:00:12 pppve1 kernel: [3375931.782156]  ? __pfx_pollwake+0x10/0x10
Jul  4 20:00:12 pppve1 kernel: [3375931.786622]  ? __pfx_pollwake+0x10/0x10
Jul  4 20:00:12 pppve1 kernel: [3375931.791111]  __x64_sys_ppoll+0xde/0x170
Jul  4 20:00:12 pppve1 kernel: [3375931.795656]  x64_sys_call+0x1818/0x2480
Jul  4 20:00:12 pppve1 kernel: [3375931.800193]  do_syscall_64+0x81/0x170
Jul  4 20:00:12 pppve1 kernel: [3375931.804485]  ? __x64_sys_ppoll+0xf2/0x170
Jul  4 20:00:12 pppve1 kernel: [3375931.809100]  ? syscall_exit_to_user_mode+0x86/0x260
Jul  4 20:00:12 pppve1 kernel: [3375931.814566]  ? do_syscall_64+0x8d/0x170
Jul  4 20:00:12 pppve1 kernel: [3375931.818979]  ? syscall_exit_to_user_mode+0x86/0x260
Jul  4 20:00:12 pppve1 kernel: [3375931.824425]  ? do_syscall_64+0x8d/0x170
Jul  4 20:00:12 pppve1 kernel: [3375931.828825]  ? clear_bhb_loop+0x15/0x70
Jul  4 20:00:12 pppve1 kernel: [3375931.833211]  ? clear_bhb_loop+0x15/0x70
Jul  4 20:00:12 pppve1 kernel: [3375931.837579]  ? clear_bhb_loop+0x15/0x70
Jul  4 20:00:12 pppve1 kernel: [3375931.841928]  entry_SYSCALL_64_after_hwframe+0x78/0x80
Jul  4 20:00:12 pppve1 kernel: [3375931.847482] RIP: 0033:0x765bb1ce8316
Jul  4 20:00:12 pppve1 kernel: [3375931.851577] Code: 7c 24 08 e8 2c 95 f8 ff 4c 8b 54 24 18 48 8b 74 24 10 41 b8 08 00 00 00 41 89 c1 48 8b 7c 24 08 4c 89 e2 b8 0f 01 00 00 0f 05 <48> 3d 00 f0 ff ff 77 32 44 89 cf 89 44 24 08 e8 76 95 f8 ff 8b 44
Jul  4 20:00:12 pppve1 kernel: [3375931.871194] RSP: 002b:00007fff2d39ea20 EFLAGS: 00000293 ORIG_RAX: 000000000000010f
Jul  4 20:00:12 pppve1 kernel: [3375931.879298] RAX: ffffffffffffffda RBX: 00006045d3e68470 RCX: 0000765bb1ce8316
Jul  4 20:00:12 pppve1 kernel: [3375931.886963] RDX: 00007fff2d39ea40 RSI: 0000000000000010 RDI: 00006045d4de5f20
Jul  4 20:00:12 pppve1 kernel: [3375931.894630] RBP: 00007fff2d39eaac R08: 0000000000000008 R09: 0000000000000000
Jul  4 20:00:12 pppve1 kernel: [3375931.902299] R10: 0000000000000000 R11: 0000000000000293 R12: 00007fff2d39ea40
Jul  4 20:00:12 pppve1 kernel: [3375931.909951] R13: 00006045d3e68470 R14: 00006045b014d570 R15: 00007fff2d39eab0
Jul  4 20:00:12 pppve1 kernel: [3375931.917656]  </TASK>
Jul  4 20:00:12 pppve1 kernel: [3375931.920515] Mem-Info:
Jul  4 20:00:12 pppve1 kernel: [3375931.923465] active_anon:4467063 inactive_anon:2449638 isolated_anon:0
Jul  4 20:00:12 pppve1 kernel: [3375931.923465]  active_file:611 inactive_file:303 isolated_file:0
Jul  4 20:00:12 pppve1 kernel: [3375931.923465]  unevictable:39551 dirty:83 writeback:237
Jul  4 20:00:12 pppve1 kernel: [3375931.923465]  slab_reclaimable:434580 slab_unreclaimable:1792355
Jul  4 20:00:12 pppve1 kernel: [3375931.923465]  mapped:571491 shmem:581427 pagetables:26365
Jul  4 20:00:12 pppve1 kernel: [3375931.923465]  sec_pagetables:11751 bounce:0
Jul  4 20:00:12 pppve1 kernel: [3375931.923465]  kernel_misc_reclaimable:0
Jul  4 20:00:12 pppve1 kernel: [3375931.923465]  free:234516 free_pcp:5874 free_cma:0
Jul  4 20:00:12 pppve1 kernel: [3375931.969518] Node 0 active_anon:17033436kB inactive_anon:10633368kB active_file:64kB inactive_file:3196kB unevictable:158204kB isolated(anon):0kB isolated(file):0kB mapped:2285988kB dirty:356kB writeback:948kB shmem:2325708kB shmem_thp:0kB shmem_pmdmapped:0kB anon_thp:866304kB writeback_tmp:0kB kernel_stack:11520kB pagetables:105460kB sec_pagetables:47004kB all_unreclaimable? no
Jul  4 20:00:12 pppve1 kernel: [3375932.004977] Node 0 DMA free:11264kB boost:0kB min:12kB low:24kB high:36kB reserved_highatomic:0KB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:15996kB managed:15360kB mlocked:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
Jul  4 20:00:12 pppve1 kernel: [3375932.032646] lowmem_reserve[]: 0 1527 63844 63844 63844
Jul  4 20:00:12 pppve1 kernel: [3375932.038675] Node 0 DMA32 free:252428kB boost:0kB min:1616kB low:3176kB high:4736kB reserved_highatomic:2048KB active_anon:310080kB inactive_anon:986436kB active_file:216kB inactive_file:0kB unevictable:0kB writepending:0kB present:1690624kB managed:1623508kB mlocked:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
Jul  4 20:00:12 pppve1 kernel: [3375932.069110] lowmem_reserve[]: 0 0 62317 62317 62317
Jul  4 20:00:12 pppve1 kernel: [3375932.074979] Node 0 Normal free:814396kB boost:290356kB min:356304kB low:420116kB high:483928kB reserved_highatomic:346112KB active_anon:11258684kB inactive_anon:15111580kB active_file:0kB inactive_file:2316kB unevictable:158204kB writepending:1304kB present:65011712kB managed:63820796kB mlocked:155132kB bounce:0kB free_pcp:12728kB local_pcp:0kB free_cma:0kB
Jul  4 20:00:12 pppve1 kernel: [3375932.109188] lowmem_reserve[]: 0 0 0 0 0
Jul  4 20:00:12 pppve1 kernel: [3375932.114119] Node 0 DMA: 0*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 1*1024kB (U) 1*2048kB (M) 2*4096kB (M) = 11264kB
Jul  4 20:00:12 pppve1 kernel: [3375932.127796] Node 0 DMA32: 5689*4kB (UMH) 1658*8kB (UMH) 381*16kB (UM) 114*32kB (UME) 97*64kB (UME) 123*128kB (UMEH) 87*256kB (MEH) 96*512kB (UMEH) 58*1024kB (UME) 5*2048kB (UME) 11*4096kB (ME) = 253828kB
Jul  4 20:00:12 pppve1 kernel: [3375932.148050] Node 0 Normal: 16080*4kB (UMEH) 36886*8kB (UMEH) 22890*16kB (UMEH) 4687*32kB (MEH) 159*64kB (UMEH) 10*128kB (UE) 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 887088kB
Jul  4 20:00:12 pppve1 kernel: [3375932.165899] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
Jul  4 20:00:12 pppve1 kernel: [3375932.175876] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
Jul  4 20:00:12 pppve1 kernel: [3375932.185569] 586677 total pagecache pages
Jul  4 20:00:12 pppve1 kernel: [3375932.190737] 0 pages in swap cache
Jul  4 20:00:12 pppve1 kernel: [3375932.195285] Free swap  = 0kB
Jul  4 20:00:12 pppve1 kernel: [3375932.199404] Total swap = 0kB
Jul  4 20:00:12 pppve1 kernel: [3375932.203513] 16679583 pages RAM
Jul  4 20:00:12 pppve1 kernel: [3375932.207787] 0 pages HighMem/MovableOnly
Jul  4 20:00:12 pppve1 kernel: [3375932.212819] 314667 pages reserved
Jul  4 20:00:12 pppve1 kernel: [3375932.217321] 0 pages hwpoisoned
Jul  4 20:00:12 pppve1 kernel: [3375932.221525] Tasks state (memory values in pages):
Jul  4 20:00:12 pppve1 kernel: [3375932.227400] [  pid  ]   uid  tgid total_vm      rss rss_anon rss_file rss_shmem pgtables_bytes swapents oom_score_adj name
Jul  4 20:00:12 pppve1 kernel: [3375932.239680] [   1959]   106  1959     1971      544       96      448         0    61440        0             0 rpcbind
Jul  4 20:00:12 pppve1 kernel: [3375932.251672] [   1982]   104  1982     2350      672      160      512         0    57344        0          -900 dbus-daemon
Jul  4 20:00:12 pppve1 kernel: [3375932.264020] [   1991]     0  1991     1767      275       83      192         0    57344        0             0 ksmtuned
Jul  4 20:00:12 pppve1 kernel: [3375932.276094] [   1995]     0  1995    69541      480       64      416         0    86016        0             0 pve-lxc-syscall
Jul  4 20:00:12 pppve1 kernel: [3375932.289686] [   2002]     0  2002     1330      384       32      352         0    53248        0             0 qmeventd
Jul  4 20:00:12 pppve1 kernel: [3375932.301811] [   2003]     0  2003    55449      727      247      480         0    86016        0             0 rsyslogd
Jul  4 20:00:12 pppve1 kernel: [3375932.313919] [   2004]     0  2004     3008      928      448      480         0    69632        0             0 smartd
Jul  4 20:00:12 pppve1 kernel: [3375932.325834] [   2009]     0  2009     6386      992      224      768         0    77824        0             0 systemd-logind
Jul  4 20:00:12 pppve1 kernel: [3375932.339438] [   2010]     0  2010      584      256        0      256         0    40960        0         -1000 watchdog-mux
Jul  4 20:00:12 pppve1 kernel: [3375932.352936] [   2021]     0  2021    60174      928      256      672         0    90112        0             0 zed
Jul  4 20:00:12 pppve1 kernel: [3375932.364626] [   2136]     0  2136    75573      256       64      192         0    86016        0         -1000 lxcfs
Jul  4 20:00:12 pppve1 kernel: [3375932.376485] [   2397]     0  2397     2208      480       64      416         0    61440        0             0 lxc-monitord
Jul  4 20:00:12 pppve1 kernel: [3375932.389169] [   2421]     0  2421    40673      454       70      384         0    73728        0             0 apcupsd
Jul  4 20:00:12 pppve1 kernel: [3375932.400685] [   2426]     0  2426     3338      428      172      256         0    69632        0             0 iscsid
Jul  4 20:00:12 pppve1 kernel: [3375932.412121] [   2427]     0  2427     3464     3343      431     2912         0    77824        0           -17 iscsid
Jul  4 20:00:12 pppve1 kernel: [3375932.423754] [   2433]     0  2433     3860     1792      320     1472         0    77824        0         -1000 sshd
Jul  4 20:00:12 pppve1 kernel: [3375932.435208] [   2461]     0  2461   189627     2688     1344     1344         0   155648        0             0 dsm_ism_srvmgrd
Jul  4 20:00:12 pppve1 kernel: [3375932.448290] [   2490]   113  2490     4721      750      142      608         0    61440        0             0 chronyd
Jul  4 20:00:12 pppve1 kernel: [3375932.459988] [   2492]   113  2492     2639      502      118      384         0    61440        0             0 chronyd
Jul  4 20:00:12 pppve1 kernel: [3375932.471684] [   2531]     0  2531     1469      448       32      416         0    49152        0             0 agetty
Jul  4 20:00:12 pppve1 kernel: [3375932.483269] [   2555]     0  2555   126545      673      244      429         0   147456        0             0 rrdcached
Jul  4 20:00:12 pppve1 kernel: [3375932.483275] [   2582]     0  2582   155008    15334     3093      864     11377   434176        0             0 pmxcfs
Jul  4 20:00:12 pppve1 kernel: [3375932.506653] [   2654]     0  2654    10667      614      134      480         0    77824        0             0 master
Jul  4 20:00:12 pppve1 kernel: [3375932.517986] [   2656]   107  2656    10812      704      160      544         0    73728        0             0 qmgr
Jul  4 20:00:12 pppve1 kernel: [3375932.529118] [   2661]     0  2661   139892    41669    28417     2980     10272   405504        0             0 corosync
Jul  4 20:00:12 pppve1 kernel: [3375932.540553] [   2662]     0  2662     1653      576       32      544         0    53248        0             0 cron
Jul  4 20:00:12 pppve1 kernel: [3375932.551657] [   2664]     0  2664     1621      480       96      384         0    57344        0             0 proxmox-firewal
Jul  4 20:00:12 pppve1 kernel: [3375932.564093] [   3164]     0  3164    83332    26227    25203      768       256   360448        0             0 pve-firewall
Jul  4 20:00:12 pppve1 kernel: [3375932.576192] [   3233]     0  3233    85947    28810    27242     1216       352   385024        0             0 pvestatd
Jul  4 20:00:12 pppve1 kernel: [3375932.587638] [   3417]     0  3417    93674    36011    35531      480         0   438272        0             0 pvedaemon
Jul  4 20:00:12 pppve1 kernel: [3375932.599167] [   3421]     0  3421    95913    37068    35884     1120        64   454656        0             0 pvedaemon worke
Jul  4 20:00:12 pppve1 kernel: [3375932.611536] [   3424]     0  3424    96072    36972    35852     1088        32   454656        0             0 pvedaemon worke
Jul  4 20:00:12 pppve1 kernel: [3375932.623977] [   3426]     0  3426    96167    37068    35948     1056        64   458752        0             0 pvedaemon worke
Jul  4 20:00:12 pppve1 kernel: [3375932.636698] [   3558]     0  3558    90342    29540    28676      608       256   385024        0             0 pve-ha-crm
Jul  4 20:00:12 pppve1 kernel: [3375932.648477] [   3948]    33  3948    94022    37705    35849     1856         0   471040        0             0 pveproxy
Jul  4 20:00:12 pppve1 kernel: [3375932.660083] [   3954]    33  3954    21688    14368    12736     1632         0   221184        0             0 spiceproxy
Jul  4 20:00:12 pppve1 kernel: [3375932.671862] [   3956]     0  3956    90222    29321    28521      544       256   397312        0             0 pve-ha-lrm
Jul  4 20:00:12 pppve1 kernel: [3375932.683484] [   3994]     0  3994  1290140   706601   705993      608         0  6389760        0             0 kvm
Jul  4 20:00:12 pppve1 kernel: [3375932.694551] [   4088]     0  4088  1271416  1040767  1040223      544         0  8994816        0             0 kvm
Jul  4 20:00:12 pppve1 kernel: [3375932.705624] [   4160]     0  4160    89394    30149    29541      608         0   380928        0             0 pvescheduler
Jul  4 20:00:12 pppve1 kernel: [3375932.717864] [   4710]     0  4710     1375      480       32      448         0    57344        0             0 agetty
Jul  4 20:00:12 pppve1 kernel: [3375932.729183] [   5531]     0  5531   993913   567351   566647      704         0  5611520        0             0 kvm
Jul  4 20:00:12 pppve1 kernel: [3375932.740212] [   6368]     0  6368  5512483  4229046  4228342      704         0 34951168        0             0 kvm
Jul  4 20:00:12 pppve1 kernel: [3375932.751255] [   9796]     0  9796     1941      768       64      704         0    57344        0             0 lxc-start
Jul  4 20:00:12 pppve1 kernel: [3375932.762840] [   9808] 100000  9808     3875      160       32      128         0    77824        0             0 init
Jul  4 20:00:12 pppve1 kernel: [3375932.774063] [  11447] 100000 11447     9272      192       64      128         0   118784        0             0 rpcbind
Jul  4 20:00:12 pppve1 kernel: [3375932.785534] [  11620] 100000 11620    45718      240      112      128         0   126976        0             0 rsyslogd
Jul  4 20:00:12 pppve1 kernel: [3375932.797241] [  11673] 100000 11673     4758      195       35      160         0    81920        0             0 atd
Jul  4 20:00:12 pppve1 kernel: [3375932.808516] [  11748] 100000 11748     6878      228       36      192         0    98304        0             0 cron
Jul  4 20:00:12 pppve1 kernel: [3375932.819868] [  11759] 100102 11759    10533      257       65      192         0   122880        0             0 dbus-daemon
Jul  4 20:00:12 pppve1 kernel: [3375932.832328] [  11765] 100000 11765    13797      315      155      160         0   143360        0             0 sshd
Jul  4 20:00:12 pppve1 kernel: [3375932.843547] [  11989] 100104 11989   565602    19744      288      160     19296   372736        0             0 postgres
Jul  4 20:00:12 pppve1 kernel: [3375932.855266] [  12169] 100104 12169   565938   537254      678      192    536384  4517888        0             0 postgres
Jul  4 20:00:12 pppve1 kernel: [3375932.866950] [  12170] 100104 12170   565859   199654      550      224    198880  4296704        0             0 postgres
Jul  4 20:00:12 pppve1 kernel: [3375932.878525] [  12171] 100104 12171   565859     4710      358      224      4128   241664        0             0 postgres
Jul  4 20:00:12 pppve1 kernel: [3375932.890252] [  12172] 100104 12172   565962     7654      518      192      6944   827392        0             0 postgres
Jul  4 20:00:12 pppve1 kernel: [3375932.901845] [  12173] 100104 12173    20982      742      518      224         0   200704        0             0 postgres
Jul  4 20:00:13 pppve1 kernel: [3375932.913421] [  13520] 100000 13520     9045      192      128       64         0   114688        0             0 master
Jul  4 20:00:13 pppve1 kernel: [3375932.924809] [  13536] 100100 13536     9601      320      128      192         0   126976        0             0 qmgr
Jul  4 20:00:13 pppve1 kernel: [3375932.936088] [  13547] 100000 13547     3168      192       32      160         0    73728        0             0 getty
Jul  4 20:00:13 pppve1 kernel: [3375932.947424] [  13548] 100000 13548     3168      160       32      128         0    73728        0             0 getty
Jul  4 20:00:13 pppve1 kernel: [3375932.958761] [1302486]     0 1302486     1941      768       96      672         0    53248        0             0 lxc-start
Jul  4 20:00:13 pppve1 kernel: [3375932.970490] [1302506] 100000 1302506     2115      128       32       96         0    65536        0             0 init
Jul  4 20:00:13 pppve1 kernel: [3375932.981999] [1302829] 100001 1302829     2081      128        0      128         0    61440        0             0 portmap
Jul  4 20:00:13 pppve1 kernel: [3375932.993763] [1302902] 100000 1302902    27413      160       64       96         0   122880        0             0 rsyslogd
Jul  4 20:00:13 pppve1 kernel: [3375933.005719] [1302953] 100000 1302953   117996     1654     1366      227        61   450560        0             0 apache2
Jul  4 20:00:13 pppve1 kernel: [3375933.017459] [1302989] 100000 1302989     4736       97       33       64         0    81920        0             0 atd
Jul  4 20:00:13 pppve1 kernel: [3375933.028905] [1303004] 100104 1303004     5843       64       32       32         0    94208        0             0 dbus-daemon
Jul  4 20:00:13 pppve1 kernel: [3375933.041272] [1303030] 100000 1303030    12322      334      110      224         0   139264        0             0 sshd
Jul  4 20:00:13 pppve1 kernel: [3375933.052755] [1303048] 100000 1303048     5664       64       32       32         0    94208        0             0 cron
Jul  4 20:00:13 pppve1 kernel: [3375933.064220] [1303255] 100000 1303255     9322      224       96      128         0   118784        0             0 master
Jul  4 20:00:13 pppve1 kernel: [3375933.075896] [1303284] 100101 1303284     9878      352      128      224         0   122880        0             0 qmgr
Jul  4 20:00:13 pppve1 kernel: [3375933.087405] [1303285] 100000 1303285     1509       32        0       32         0    61440        0             0 getty
Jul  4 20:00:13 pppve1 kernel: [3375933.099008] [1303286] 100000 1303286     1509       64        0       64         0    61440        0             0 getty
Jul  4 20:00:13 pppve1 kernel: [3375933.110571] [1420994]    33 1420994    21749    13271    12759      512         0   204800        0             0 spiceproxy work
Jul  4 20:00:13 pppve1 kernel: [3375933.123378] [1421001]    33 1421001    94055    37044    35892     1152         0   434176        0             0 pveproxy worker
Jul  4 20:00:13 pppve1 kernel: [3375933.136284] [1421002]    33 1421002    94055    36980    35860     1120         0   434176        0             0 pveproxy worker
Jul  4 20:00:13 pppve1 kernel: [3375933.149173] [1421003]    33 1421003    94055    37044    35892     1152         0   434176        0             0 pveproxy worker
Jul  4 20:00:13 pppve1 kernel: [3375933.162040] [2316827]     0 2316827     6820     1088      224      864         0    69632        0         -1000 systemd-udevd
Jul  4 20:00:13 pppve1 kernel: [3375933.174778] [2316923]     0 2316923    51282     2240      224     2016         0   438272        0          -250 systemd-journal
Jul  4 20:00:13 pppve1 kernel: [3375933.187768] [3148356]     0 3148356    32681    21120    19232     1888         0   249856        0             0 glpi-agent (tag
Jul  4 20:00:13 pppve1 kernel: [3375933.200481] [3053571]     0 3053571    19798      480       32      448         0    57344        0             0 pvefw-logger
Jul  4 20:00:13 pppve1 kernel: [3375933.212970] [3498513] 100033 3498513   119792     7207     2632      223      4352   516096        0             0 apache2
Jul  4 20:00:13 pppve1 kernel: [3375933.224713] [3498820] 100104 3498820   575918   235975     9351      160    226464  3424256        0             0 postgres
Jul  4 20:00:13 pppve1 kernel: [3375933.236579] [3500997] 100033 3500997   119889     7202     2594      192      4416   524288        0             0 apache2
Jul  4 20:00:13 pppve1 kernel: [3375933.248240] [3501657] 100104 3501657   571325   199025     6001      160    192864  2945024        0             0 postgres
Jul  4 20:00:13 pppve1 kernel: [3375933.260100] [3502514] 100033 3502514   119119     5907     2004      191      3712   503808        0             0 apache2
Jul  4 20:00:13 pppve1 kernel: [3375933.271772] [3503679] 100104 3503679   575295   211508     6612      192    204704  2953216        0             0 postgres
Jul  4 20:00:13 pppve1 kernel: [3375933.283619] [3515234] 100033 3515234   119042     6568     1960      192      4416   503808        0             0 apache2
Jul  4 20:00:13 pppve1 kernel: [3375933.295362] [3515420] 100104 3515420   569839    97579     4491      160     92928  2293760        0             0 postgres
Jul  4 20:00:13 pppve1 kernel: [3375933.307155] [3520282] 100033 3520282   119129     5416     2056      192      3168   495616        0             0 apache2
Jul  4 20:00:13 pppve1 kernel: [3375933.318923] [3520287] 100033 3520287   119015     5709     1894      167      3648   503808        0             0 apache2
Jul  4 20:00:13 pppve1 kernel: [3375933.330805] [3520288] 100033 3520288   119876     5961     2729      224      3008   507904        0             0 apache2
Jul  4 20:00:13 pppve1 kernel: [3375933.342648] [3521057] 100104 3521057   573824    46069     8341      128     37600  1830912        0             0 postgres
Jul  4 20:00:13 pppve1 kernel: [3375933.354567] [3521067] 100104 3521067   574768    99734     7446       96     92192  2134016        0             0 postgres
Jul  4 20:00:13 pppve1 kernel: [3375933.366512] [3521301] 100104 3521301   569500   174722     4194      160    170368  2482176        0             0 postgres
Jul  4 20:00:13 pppve1 kernel: [3375933.378484] [3532810] 100033 3532810   118740     4127     1727      160      2240   479232        0             0 apache2
Jul  4 20:00:13 pppve1 kernel: [3375933.390140] [3532933] 100033 3532933   118971     5064     1864      160      3040   503808        0             0 apache2
Jul  4 20:00:13 pppve1 kernel: [3375933.401854] [3534151] 100104 3534151   567344   168822     1686      160    166976  2408448        0             0 postgres
Jul  4 20:00:13 pppve1 kernel: [3375933.413852] [3535832] 100104 3535832   569005    41042     2578      128     38336  1150976        0             0 postgres
Jul  4 20:00:13 pppve1 kernel: [3375933.425919] [3550993] 100033 3550993   118029     1768     1544      224         0   425984        0             0 apache2
Jul  4 20:00:13 pppve1 kernel: [3375933.437868] [3560475]   107 3560475    10767      928      160      768         0    77824        0             0 pickup
Jul  4 20:00:13 pppve1 kernel: [3375933.449513] [3563017] 100101 3563017     9838      256       96      160         0   122880        0             0 pickup
Jul  4 20:00:13 pppve1 kernel: [3375933.461255] [3575085] 100100 3575085     9561      288      128      160         0   118784        0             0 pickup
Jul  4 20:00:13 pppve1 kernel: [3375933.473119] [3579986]     0 3579986     1367      384        0      384         0    49152        0             0 sleep
Jul  4 20:00:13 pppve1 kernel: [3375933.484646] [3579996] 100104 3579996   566249     5031      615      128      4288   450560        0             0 postgres
Jul  4 20:00:13 pppve1 kernel: [3375933.496645] [3580020]     0 3580020    91269    30310    29606      704         0   409600        0             0 pvescheduler
Jul  4 20:00:13 pppve1 kernel: [3375933.509585] [3580041]     0 3580041     5005     1920      640     1280         0    81920        0           100 systemd
Jul  4 20:00:13 pppve1 kernel: [3375933.521297] [3580044]     0 3580044    42685     1538     1218      320         0   102400        0           100 (sd-pam)
Jul  4 20:00:13 pppve1 kernel: [3375933.533226] [3580125] 100104 3580125   566119     5607      583      704      4320   446464        0             0 postgres
Jul  4 20:00:13 pppve1 kernel: [3375933.545245] [3580193]     0 3580193     4403     2368      384     1984         0    81920        0             0 sshd
Jul  4 20:00:13 pppve1 kernel: [3375933.556849] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=qemu.slice,mems_allowed=0,global_oom,task_memcg=/qemu.slice/121.scope,task=kvm,pid=6368,uid=0
Jul  4 20:00:13 pppve1 kernel: [3375933.573133] Out of memory: Killed process 6368 (kvm) total-vm:22049932kB, anon-rss:16913368kB, file-rss:2944kB, shmem-rss:0kB, UID:0 pgtables:34132kB oom_score_adj:0
Jul  4 20:00:15 pppve1 kernel: [3375935.378441]  zd16: p1 p2 p3 < p5 p6 >
Jul  4 20:00:16 pppve1 kernel: [3375936.735383] oom_reaper: reaped process 6368 (kvm), now anon-rss:0kB, file-rss:32kB, shmem-rss:0kB
Jul  4 20:01:11 pppve1 kernel: [3375991.767379] vmbr0: port 5(tap121i0) entered disabled state
Jul  4 20:01:11 pppve1 kernel: [3375991.778143] tap121i0 (unregistering): left allmulticast mode
Jul  4 20:01:11 pppve1 kernel: [3375991.785976] vmbr0: port 5(tap121i0) entered disabled state
Jul  4 20:01:11 pppve1 kernel: [3375991.791555]  zd128: p1
Jul  4 20:01:13 pppve1 kernel: [3375993.594688]  zd176: p1 p2


> It makes little sense that OOM triggers on 64GB hosts with just 24GB configured
> in VMs and, probably, less real usage. IMHO it's not the VMs that fill your
> memory up to the point of OOM, but something else: another process, the ZFS ARC,
> maybe even a memory leak. Maybe some process is producing severe memory fragmentation.

I can confirm that the server was doing some heavy I/O (a backup), but AFAIK
nothing more.


Mandi! Roland

> it's a little bit weird that OOM kicks in with <32GB of RAM in VMs when you have 64GB
> take a closer look at why this happens, i.e. why OOM thinks there is RAM pressure

effectively the server was running:
 + vm 100, 2GB
 + vm 120, 4GB
 + vm 121, 16GB
 + vm 127, 4GB
 + lxc 124, 2GB
 + lxc 125, 4GB

so exactly 32GB of RAM. But most of the VMs/LXCs barely reached half of their
allocated RAM...
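(Configured sizes taken from the VM/LXC configs, with something like:

  for id in 100 120 121 127; do qm config $id | grep -E '^(memory|balloon):'; done
  for id in 124 125; do pct config $id | grep -E '^(memory|swap):'; done
)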



Thanks.

-- 


_______________________________________________
pve-user mailing list
pve-user@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-user


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PVE-User] A less aggressive OOM?
  2025-07-08 16:31   ` Marco Gaiarin
@ 2025-07-10  8:56     ` Victor Rodriguez
  2025-07-10  9:08       ` Roland via pve-user
  0 siblings, 1 reply; 10+ messages in thread
From: Victor Rodriguez @ 2025-07-10  8:56 UTC (permalink / raw)
  To: Proxmox VE user list, Marco Gaiarin

Hi,

I checked the OOM log, and to me the conclusion is clear (disclaimer: the 
numbers might not be exact):

- You had around 26.7G of memory used by processes + 2.3G of shared memory:

active_anon:17033436kB
inactive_anon:10633368kB
shmem:2325708kB
mapped:2285988kB
unevictable:158204kB

- It seems like you are also using ZFS (some zd* disks appear in the log) 
and, given that you were doing backups at the time of the OOM, I will 
suppose that your ARC limit is set to the default 50% of the host's memory 
(check with arc_summary), so another 32G of used memory. The ARC is 
reclaimable by the host, but usually ZFS does not return that memory fast 
enough, especially under heavy use of the ARC (i.e. reading for a backup), 
so you can't really count on that memory.
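(The current and maximum ARC size can be checked for example with:

  arc_summary | grep -A 3 'ARC size'
  # or straight from the kernel counters, values in bytes:
  grep -wE '^(size|c_max)' /proc/spl/kstat/zfs/arcstats
)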

- Memory was quite fragmented and only small free blocks were available:

Node 0 Normal:
   16080*4kB
   36886*8kB
   22890*16kB
   4687*32kB
   159*64kB
   10*128kB
   0*256kB
   0*512kB
   0*1024kB
   0*2048kB
   0*4096kB
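(The same per-order free counts can be checked live at any time in
/proc/buddyinfo; each column is the number of free blocks of order 0..10,
i.e. 4kB up to 4MB:

  cat /proc/buddyinfo
)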


Conclusions:

You had 32+26.7+2.3 ≃ 61G of used memory, with the ~3G available being 
small blocks that can't be used for the typically large allocations that 
VMs do. Your host had no choice but to trigger the OOM killer.


What I would do:

- Lower the ARC size [1] (a minimal sketch follows this list).
- Add some swap (never place it on a ZFS disk!). Even some ZRAM could help.
- Lower your VMs' memory: the total, the minimum memory (balloon), or both. 
Check that the VirtIO drivers + balloon driver are installed and working so 
the host can reclaim memory from the guests.
- Get more RAM :)
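For the ARC limit, a minimal sketch following [1] (8 GiB is just an example 
value, pick what fits your pools and workload):

  # apply immediately
  echo $((8 * 1024*1024*1024)) > /sys/module/zfs/parameters/zfs_arc_max
  # make it persistent across reboots
  echo 'options zfs zfs_arc_max=8589934592' > /etc/modprobe.d/zfs.conf
  update-initramfs -u -k all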


Regards


[1] 
https://pve.proxmox.com/wiki/ZFS_on_Linux#sysadmin_zfs_limit_memory_usage



On 7/8/25 18:31, Marco Gaiarin wrote:
> Mandi! Victor Rodriguez
>    In chel di` si favelave...
>
>> I would start by analyzing the memory status at the time of the OOM. There
>> should be some lines in the journal/syslog where the kernel writes what the
>> memory looked like, so you can figure out why it had to kill a process.
> This is the full OOM log:
>
> Jul  4 20:00:12 pppve1 kernel: [3375931.660119] kvm invoked oom-killer: gfp_mask=0xcc0(GFP_KERNEL), order=0, oom_score_adj=0
> Jul  4 20:00:12 pppve1 kernel: [3375931.669158] CPU: 1 PID: 4088 Comm: kvm Tainted: P           O       6.8.12-10-pve #1
> Jul  4 20:00:12 pppve1 kernel: [3375931.677778] Hardware name: Dell Inc. PowerEdge T440/021KCD, BIOS 2.24.0 04/02/2025
> Jul  4 20:00:12 pppve1 kernel: [3375931.686211] Call Trace:
> Jul  4 20:00:12 pppve1 kernel: [3375931.689504]  <TASK>
> Jul  4 20:00:12 pppve1 kernel: [3375931.692428]  dump_stack_lvl+0x76/0xa0
> Jul  4 20:00:12 pppve1 kernel: [3375931.696915]  dump_stack+0x10/0x20
> Jul  4 20:00:12 pppve1 kernel: [3375931.701057]  dump_header+0x47/0x1f0
> Jul  4 20:00:12 pppve1 kernel: [3375931.705358]  oom_kill_process+0x110/0x240
> Jul  4 20:00:12 pppve1 kernel: [3375931.710169]  out_of_memory+0x26e/0x560
> Jul  4 20:00:12 pppve1 kernel: [3375931.714707]  __alloc_pages+0x10ce/0x1320
> Jul  4 20:00:12 pppve1 kernel: [3375931.719422]  alloc_pages_mpol+0x91/0x1f0
> Jul  4 20:00:12 pppve1 kernel: [3375931.724136]  alloc_pages+0x54/0xb0
> Jul  4 20:00:12 pppve1 kernel: [3375931.728320]  __get_free_pages+0x11/0x50
> Jul  4 20:00:12 pppve1 kernel: [3375931.732938]  __pollwait+0x9e/0xe0
> Jul  4 20:00:12 pppve1 kernel: [3375931.737015]  eventfd_poll+0x2c/0x70
> Jul  4 20:00:12 pppve1 kernel: [3375931.741261]  do_sys_poll+0x2f4/0x610
> Jul  4 20:00:12 pppve1 kernel: [3375931.745587]  ? __pfx___pollwait+0x10/0x10
> Jul  4 20:00:12 pppve1 kernel: [3375931.750332]  ? __pfx_pollwake+0x10/0x10
> Jul  4 20:00:12 pppve1 kernel: [3375931.754900]  ? __pfx_pollwake+0x10/0x10
> Jul  4 20:00:12 pppve1 kernel: [3375931.759463]  ? __pfx_pollwake+0x10/0x10
> Jul  4 20:00:12 pppve1 kernel: [3375931.764011]  ? __pfx_pollwake+0x10/0x10
> Jul  4 20:00:12 pppve1 kernel: [3375931.768617]  ? __pfx_pollwake+0x10/0x10
> Jul  4 20:00:12 pppve1 kernel: [3375931.773165]  ? __pfx_pollwake+0x10/0x10
> Jul  4 20:00:12 pppve1 kernel: [3375931.777688]  ? __pfx_pollwake+0x10/0x10
> Jul  4 20:00:12 pppve1 kernel: [3375931.782156]  ? __pfx_pollwake+0x10/0x10
> Jul  4 20:00:12 pppve1 kernel: [3375931.786622]  ? __pfx_pollwake+0x10/0x10
> Jul  4 20:00:12 pppve1 kernel: [3375931.791111]  __x64_sys_ppoll+0xde/0x170
> Jul  4 20:00:12 pppve1 kernel: [3375931.795656]  x64_sys_call+0x1818/0x2480
> Jul  4 20:00:12 pppve1 kernel: [3375931.800193]  do_syscall_64+0x81/0x170
> Jul  4 20:00:12 pppve1 kernel: [3375931.804485]  ? __x64_sys_ppoll+0xf2/0x170
> Jul  4 20:00:12 pppve1 kernel: [3375931.809100]  ? syscall_exit_to_user_mode+0x86/0x260
> Jul  4 20:00:12 pppve1 kernel: [3375931.814566]  ? do_syscall_64+0x8d/0x170
> Jul  4 20:00:12 pppve1 kernel: [3375931.818979]  ? syscall_exit_to_user_mode+0x86/0x260
> Jul  4 20:00:12 pppve1 kernel: [3375931.824425]  ? do_syscall_64+0x8d/0x170
> Jul  4 20:00:12 pppve1 kernel: [3375931.828825]  ? clear_bhb_loop+0x15/0x70
> Jul  4 20:00:12 pppve1 kernel: [3375931.833211]  ? clear_bhb_loop+0x15/0x70
> Jul  4 20:00:12 pppve1 kernel: [3375931.837579]  ? clear_bhb_loop+0x15/0x70
> Jul  4 20:00:12 pppve1 kernel: [3375931.841928]  entry_SYSCALL_64_after_hwframe+0x78/0x80
> Jul  4 20:00:12 pppve1 kernel: [3375931.847482] RIP: 0033:0x765bb1ce8316
> Jul  4 20:00:12 pppve1 kernel: [3375931.851577] Code: 7c 24 08 e8 2c 95 f8 ff 4c 8b 54 24 18 48 8b 74 24 10 41 b8 08 00 00 00 41 89 c1 48 8b 7c 24 08 4c 89 e2 b8 0f 01 00 00 0f 05 <48> 3d 00 f0 ff ff 77 32 44 89 cf 89 44 24 08 e8 76 95 f8 ff 8b 44
> Jul  4 20:00:12 pppve1 kernel: [3375931.871194] RSP: 002b:00007fff2d39ea20 EFLAGS: 00000293 ORIG_RAX: 000000000000010f
> Jul  4 20:00:12 pppve1 kernel: [3375931.879298] RAX: ffffffffffffffda RBX: 00006045d3e68470 RCX: 0000765bb1ce8316
> Jul  4 20:00:12 pppve1 kernel: [3375931.886963] RDX: 00007fff2d39ea40 RSI: 0000000000000010 RDI: 00006045d4de5f20
> Jul  4 20:00:12 pppve1 kernel: [3375931.894630] RBP: 00007fff2d39eaac R08: 0000000000000008 R09: 0000000000000000
> Jul  4 20:00:12 pppve1 kernel: [3375931.902299] R10: 0000000000000000 R11: 0000000000000293 R12: 00007fff2d39ea40
> Jul  4 20:00:12 pppve1 kernel: [3375931.909951] R13: 00006045d3e68470 R14: 00006045b014d570 R15: 00007fff2d39eab0
> Jul  4 20:00:12 pppve1 kernel: [3375931.917656]  </TASK>
> Jul  4 20:00:12 pppve1 kernel: [3375931.920515] Mem-Info:
> Jul  4 20:00:12 pppve1 kernel: [3375931.923465] active_anon:4467063 inactive_anon:2449638 isolated_anon:0
> Jul  4 20:00:12 pppve1 kernel: [3375931.923465]  active_file:611 inactive_file:303 isolated_file:0
> Jul  4 20:00:12 pppve1 kernel: [3375931.923465]  unevictable:39551 dirty:83 writeback:237
> Jul  4 20:00:12 pppve1 kernel: [3375931.923465]  slab_reclaimable:434580 slab_unreclaimable:1792355
> Jul  4 20:00:12 pppve1 kernel: [3375931.923465]  mapped:571491 shmem:581427 pagetables:26365
> Jul  4 20:00:12 pppve1 kernel: [3375931.923465]  sec_pagetables:11751 bounce:0
> Jul  4 20:00:12 pppve1 kernel: [3375931.923465]  kernel_misc_reclaimable:0
> Jul  4 20:00:12 pppve1 kernel: [3375931.923465]  free:234516 free_pcp:5874 free_cma:0
> Jul  4 20:00:12 pppve1 kernel: [3375931.969518] Node 0 active_anon:17033436kB inactive_anon:10633368kB active_file:64kB inactive_file:3196kB unevictable:158204kB isolated(anon):0kB isolated(file):0kB mapped:2285988kB dirty:356kB writeback:948kB shmem:2325708kB shmem_thp:0kB shmem_pmdmapped:0kB anon_thp:866304kB writeback_tmp:0kB kernel_stack:11520kB pagetables:105460kB sec_pagetables:47004kB all_unreclaimable? no
> Jul  4 20:00:12 pppve1 kernel: [3375932.004977] Node 0 DMA free:11264kB boost:0kB min:12kB low:24kB high:36kB reserved_highatomic:0KB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:15996kB managed:15360kB mlocked:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
> Jul  4 20:00:12 pppve1 kernel: [3375932.032646] lowmem_reserve[]: 0 1527 63844 63844 63844
> Jul  4 20:00:12 pppve1 kernel: [3375932.038675] Node 0 DMA32 free:252428kB boost:0kB min:1616kB low:3176kB high:4736kB reserved_highatomic:2048KB active_anon:310080kB inactive_anon:986436kB active_file:216kB inactive_file:0kB unevictable:0kB writepending:0kB present:1690624kB managed:1623508kB mlocked:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
> Jul  4 20:00:12 pppve1 kernel: [3375932.069110] lowmem_reserve[]: 0 0 62317 62317 62317
> Jul  4 20:00:12 pppve1 kernel: [3375932.074979] Node 0 Normal free:814396kB boost:290356kB min:356304kB low:420116kB high:483928kB reserved_highatomic:346112KB active_anon:11258684kB inactive_anon:15111580kB active_file:0kB inactive_file:2316kB unevictable:158204kB writepending:1304kB present:65011712kB managed:63820796kB mlocked:155132kB bounce:0kB free_pcp:12728kB local_pcp:0kB free_cma:0kB
> Jul  4 20:00:12 pppve1 kernel: [3375932.109188] lowmem_reserve[]: 0 0 0 0 0
> Jul  4 20:00:12 pppve1 kernel: [3375932.114119] Node 0 DMA: 0*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 1*1024kB (U) 1*2048kB (M) 2*4096kB (M) = 11264kB
> Jul  4 20:00:12 pppve1 kernel: [3375932.127796] Node 0 DMA32: 5689*4kB (UMH) 1658*8kB (UMH) 381*16kB (UM) 114*32kB (UME) 97*64kB (UME) 123*128kB (UMEH) 87*256kB (MEH) 96*512kB (UMEH) 58*1024kB (UME) 5*2048kB (UME) 11*4096kB (ME) = 253828kB
> Jul  4 20:00:12 pppve1 kernel: [3375932.148050] Node 0 Normal: 16080*4kB (UMEH) 36886*8kB (UMEH) 22890*16kB (UMEH) 4687*32kB (MEH) 159*64kB (UMEH) 10*128kB (UE) 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 887088kB
> Jul  4 20:00:12 pppve1 kernel: [3375932.165899] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
> Jul  4 20:00:12 pppve1 kernel: [3375932.175876] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
> Jul  4 20:00:12 pppve1 kernel: [3375932.185569] 586677 total pagecache pages
> Jul  4 20:00:12 pppve1 kernel: [3375932.190737] 0 pages in swap cache
> Jul  4 20:00:12 pppve1 kernel: [3375932.195285] Free swap  = 0kB
> Jul  4 20:00:12 pppve1 kernel: [3375932.199404] Total swap = 0kB
> Jul  4 20:00:12 pppve1 kernel: [3375932.203513] 16679583 pages RAM
> Jul  4 20:00:12 pppve1 kernel: [3375932.207787] 0 pages HighMem/MovableOnly
> Jul  4 20:00:12 pppve1 kernel: [3375932.212819] 314667 pages reserved
> Jul  4 20:00:12 pppve1 kernel: [3375932.217321] 0 pages hwpoisoned
> Jul  4 20:00:12 pppve1 kernel: [3375932.221525] Tasks state (memory values in pages):
> Jul  4 20:00:12 pppve1 kernel: [3375932.227400] [  pid  ]   uid  tgid total_vm      rss rss_anon rss_file rss_shmem pgtables_bytes swapents oom_score_adj name
> Jul  4 20:00:12 pppve1 kernel: [3375932.239680] [   1959]   106  1959     1971      544       96      448         0    61440        0             0 rpcbind
> Jul  4 20:00:12 pppve1 kernel: [3375932.251672] [   1982]   104  1982     2350      672      160      512         0    57344        0          -900 dbus-daemon
> Jul  4 20:00:12 pppve1 kernel: [3375932.264020] [   1991]     0  1991     1767      275       83      192         0    57344        0             0 ksmtuned
> Jul  4 20:00:12 pppve1 kernel: [3375932.276094] [   1995]     0  1995    69541      480       64      416         0    86016        0             0 pve-lxc-syscall
> Jul  4 20:00:12 pppve1 kernel: [3375932.289686] [   2002]     0  2002     1330      384       32      352         0    53248        0             0 qmeventd
> Jul  4 20:00:12 pppve1 kernel: [3375932.301811] [   2003]     0  2003    55449      727      247      480         0    86016        0             0 rsyslogd
> Jul  4 20:00:12 pppve1 kernel: [3375932.313919] [   2004]     0  2004     3008      928      448      480         0    69632        0             0 smartd
> Jul  4 20:00:12 pppve1 kernel: [3375932.325834] [   2009]     0  2009     6386      992      224      768         0    77824        0             0 systemd-logind
> Jul  4 20:00:12 pppve1 kernel: [3375932.339438] [   2010]     0  2010      584      256        0      256         0    40960        0         -1000 watchdog-mux
> Jul  4 20:00:12 pppve1 kernel: [3375932.352936] [   2021]     0  2021    60174      928      256      672         0    90112        0             0 zed
> Jul  4 20:00:12 pppve1 kernel: [3375932.364626] [   2136]     0  2136    75573      256       64      192         0    86016        0         -1000 lxcfs
> Jul  4 20:00:12 pppve1 kernel: [3375932.376485] [   2397]     0  2397     2208      480       64      416         0    61440        0             0 lxc-monitord
> Jul  4 20:00:12 pppve1 kernel: [3375932.389169] [   2421]     0  2421    40673      454       70      384         0    73728        0             0 apcupsd
> Jul  4 20:00:12 pppve1 kernel: [3375932.400685] [   2426]     0  2426     3338      428      172      256         0    69632        0             0 iscsid
> Jul  4 20:00:12 pppve1 kernel: [3375932.412121] [   2427]     0  2427     3464     3343      431     2912         0    77824        0           -17 iscsid
> Jul  4 20:00:12 pppve1 kernel: [3375932.423754] [   2433]     0  2433     3860     1792      320     1472         0    77824        0         -1000 sshd
> Jul  4 20:00:12 pppve1 kernel: [3375932.435208] [   2461]     0  2461   189627     2688     1344     1344         0   155648        0             0 dsm_ism_srvmgrd
> Jul  4 20:00:12 pppve1 kernel: [3375932.448290] [   2490]   113  2490     4721      750      142      608         0    61440        0             0 chronyd
> Jul  4 20:00:12 pppve1 kernel: [3375932.459988] [   2492]   113  2492     2639      502      118      384         0    61440        0             0 chronyd
> Jul  4 20:00:12 pppve1 kernel: [3375932.471684] [   2531]     0  2531     1469      448       32      416         0    49152        0             0 agetty
> Jul  4 20:00:12 pppve1 kernel: [3375932.483269] [   2555]     0  2555   126545      673      244      429         0   147456        0             0 rrdcached
> Jul  4 20:00:12 pppve1 kernel: [3375932.483275] [   2582]     0  2582   155008    15334     3093      864     11377   434176        0             0 pmxcfs
> Jul  4 20:00:12 pppve1 kernel: [3375932.506653] [   2654]     0  2654    10667      614      134      480         0    77824        0             0 master
> Jul  4 20:00:12 pppve1 kernel: [3375932.517986] [   2656]   107  2656    10812      704      160      544         0    73728        0             0 qmgr
> Jul  4 20:00:12 pppve1 kernel: [3375932.529118] [   2661]     0  2661   139892    41669    28417     2980     10272   405504        0             0 corosync
> Jul  4 20:00:12 pppve1 kernel: [3375932.540553] [   2662]     0  2662     1653      576       32      544         0    53248        0             0 cron
> Jul  4 20:00:12 pppve1 kernel: [3375932.551657] [   2664]     0  2664     1621      480       96      384         0    57344        0             0 proxmox-firewal
> Jul  4 20:00:12 pppve1 kernel: [3375932.564093] [   3164]     0  3164    83332    26227    25203      768       256   360448        0             0 pve-firewall
> Jul  4 20:00:12 pppve1 kernel: [3375932.576192] [   3233]     0  3233    85947    28810    27242     1216       352   385024        0             0 pvestatd
> Jul  4 20:00:12 pppve1 kernel: [3375932.587638] [   3417]     0  3417    93674    36011    35531      480         0   438272        0             0 pvedaemon
> Jul  4 20:00:12 pppve1 kernel: [3375932.599167] [   3421]     0  3421    95913    37068    35884     1120        64   454656        0             0 pvedaemon worke
> Jul  4 20:00:12 pppve1 kernel: [3375932.611536] [   3424]     0  3424    96072    36972    35852     1088        32   454656        0             0 pvedaemon worke
> Jul  4 20:00:12 pppve1 kernel: [3375932.623977] [   3426]     0  3426    96167    37068    35948     1056        64   458752        0             0 pvedaemon worke
> Jul  4 20:00:12 pppve1 kernel: [3375932.636698] [   3558]     0  3558    90342    29540    28676      608       256   385024        0             0 pve-ha-crm
> Jul  4 20:00:12 pppve1 kernel: [3375932.648477] [   3948]    33  3948    94022    37705    35849     1856         0   471040        0             0 pveproxy
> Jul  4 20:00:12 pppve1 kernel: [3375932.660083] [   3954]    33  3954    21688    14368    12736     1632         0   221184        0             0 spiceproxy
> Jul  4 20:00:12 pppve1 kernel: [3375932.671862] [   3956]     0  3956    90222    29321    28521      544       256   397312        0             0 pve-ha-lrm
> Jul  4 20:00:12 pppve1 kernel: [3375932.683484] [   3994]     0  3994  1290140   706601   705993      608         0  6389760        0             0 kvm
> Jul  4 20:00:12 pppve1 kernel: [3375932.694551] [   4088]     0  4088  1271416  1040767  1040223      544         0  8994816        0             0 kvm
> Jul  4 20:00:12 pppve1 kernel: [3375932.705624] [   4160]     0  4160    89394    30149    29541      608         0   380928        0             0 pvescheduler
> Jul  4 20:00:12 pppve1 kernel: [3375932.717864] [   4710]     0  4710     1375      480       32      448         0    57344        0             0 agetty
> Jul  4 20:00:12 pppve1 kernel: [3375932.729183] [   5531]     0  5531   993913   567351   566647      704         0  5611520        0             0 kvm
> Jul  4 20:00:12 pppve1 kernel: [3375932.740212] [   6368]     0  6368  5512483  4229046  4228342      704         0 34951168        0             0 kvm
> Jul  4 20:00:12 pppve1 kernel: [3375932.751255] [   9796]     0  9796     1941      768       64      704         0    57344        0             0 lxc-start
> Jul  4 20:00:12 pppve1 kernel: [3375932.762840] [   9808] 100000  9808     3875      160       32      128         0    77824        0             0 init
> Jul  4 20:00:12 pppve1 kernel: [3375932.774063] [  11447] 100000 11447     9272      192       64      128         0   118784        0             0 rpcbind
> Jul  4 20:00:12 pppve1 kernel: [3375932.785534] [  11620] 100000 11620    45718      240      112      128         0   126976        0             0 rsyslogd
> Jul  4 20:00:12 pppve1 kernel: [3375932.797241] [  11673] 100000 11673     4758      195       35      160         0    81920        0             0 atd
> Jul  4 20:00:12 pppve1 kernel: [3375932.808516] [  11748] 100000 11748     6878      228       36      192         0    98304        0             0 cron
> Jul  4 20:00:12 pppve1 kernel: [3375932.819868] [  11759] 100102 11759    10533      257       65      192         0   122880        0             0 dbus-daemon
> Jul  4 20:00:12 pppve1 kernel: [3375932.832328] [  11765] 100000 11765    13797      315      155      160         0   143360        0             0 sshd
> Jul  4 20:00:12 pppve1 kernel: [3375932.843547] [  11989] 100104 11989   565602    19744      288      160     19296   372736        0             0 postgres
> Jul  4 20:00:12 pppve1 kernel: [3375932.855266] [  12169] 100104 12169   565938   537254      678      192    536384  4517888        0             0 postgres
> Jul  4 20:00:12 pppve1 kernel: [3375932.866950] [  12170] 100104 12170   565859   199654      550      224    198880  4296704        0             0 postgres
> Jul  4 20:00:12 pppve1 kernel: [3375932.878525] [  12171] 100104 12171   565859     4710      358      224      4128   241664        0             0 postgres
> Jul  4 20:00:12 pppve1 kernel: [3375932.890252] [  12172] 100104 12172   565962     7654      518      192      6944   827392        0             0 postgres
> Jul  4 20:00:12 pppve1 kernel: [3375932.901845] [  12173] 100104 12173    20982      742      518      224         0   200704        0             0 postgres
> Jul  4 20:00:13 pppve1 kernel: [3375932.913421] [  13520] 100000 13520     9045      192      128       64         0   114688        0             0 master
> Jul  4 20:00:13 pppve1 kernel: [3375932.924809] [  13536] 100100 13536     9601      320      128      192         0   126976        0             0 qmgr
> Jul  4 20:00:13 pppve1 kernel: [3375932.936088] [  13547] 100000 13547     3168      192       32      160         0    73728        0             0 getty
> Jul  4 20:00:13 pppve1 kernel: [3375932.947424] [  13548] 100000 13548     3168      160       32      128         0    73728        0             0 getty
> Jul  4 20:00:13 pppve1 kernel: [3375932.958761] [1302486]     0 1302486     1941      768       96      672         0    53248        0             0 lxc-start
> Jul  4 20:00:13 pppve1 kernel: [3375932.970490] [1302506] 100000 1302506     2115      128       32       96         0    65536        0             0 init
> Jul  4 20:00:13 pppve1 kernel: [3375932.981999] [1302829] 100001 1302829     2081      128        0      128         0    61440        0             0 portmap
> Jul  4 20:00:13 pppve1 kernel: [3375932.993763] [1302902] 100000 1302902    27413      160       64       96         0   122880        0             0 rsyslogd
> Jul  4 20:00:13 pppve1 kernel: [3375933.005719] [1302953] 100000 1302953   117996     1654     1366      227        61   450560        0             0 apache2
> Jul  4 20:00:13 pppve1 kernel: [3375933.017459] [1302989] 100000 1302989     4736       97       33       64         0    81920        0             0 atd
> Jul  4 20:00:13 pppve1 kernel: [3375933.028905] [1303004] 100104 1303004     5843       64       32       32         0    94208        0             0 dbus-daemon
> Jul  4 20:00:13 pppve1 kernel: [3375933.041272] [1303030] 100000 1303030    12322      334      110      224         0   139264        0             0 sshd
> Jul  4 20:00:13 pppve1 kernel: [3375933.052755] [1303048] 100000 1303048     5664       64       32       32         0    94208        0             0 cron
> Jul  4 20:00:13 pppve1 kernel: [3375933.064220] [1303255] 100000 1303255     9322      224       96      128         0   118784        0             0 master
> Jul  4 20:00:13 pppve1 kernel: [3375933.075896] [1303284] 100101 1303284     9878      352      128      224         0   122880        0             0 qmgr
> Jul  4 20:00:13 pppve1 kernel: [3375933.087405] [1303285] 100000 1303285     1509       32        0       32         0    61440        0             0 getty
> Jul  4 20:00:13 pppve1 kernel: [3375933.099008] [1303286] 100000 1303286     1509       64        0       64         0    61440        0             0 getty
> Jul  4 20:00:13 pppve1 kernel: [3375933.110571] [1420994]    33 1420994    21749    13271    12759      512         0   204800        0             0 spiceproxy work
> Jul  4 20:00:13 pppve1 kernel: [3375933.123378] [1421001]    33 1421001    94055    37044    35892     1152         0   434176        0             0 pveproxy worker
> Jul  4 20:00:13 pppve1 kernel: [3375933.136284] [1421002]    33 1421002    94055    36980    35860     1120         0   434176        0             0 pveproxy worker
> Jul  4 20:00:13 pppve1 kernel: [3375933.149173] [1421003]    33 1421003    94055    37044    35892     1152         0   434176        0             0 pveproxy worker
> Jul  4 20:00:13 pppve1 kernel: [3375933.162040] [2316827]     0 2316827     6820     1088      224      864         0    69632        0         -1000 systemd-udevd
> Jul  4 20:00:13 pppve1 kernel: [3375933.174778] [2316923]     0 2316923    51282     2240      224     2016         0   438272        0          -250 systemd-journal
> Jul  4 20:00:13 pppve1 kernel: [3375933.187768] [3148356]     0 3148356    32681    21120    19232     1888         0   249856        0             0 glpi-agent (tag
> Jul  4 20:00:13 pppve1 kernel: [3375933.200481] [3053571]     0 3053571    19798      480       32      448         0    57344        0             0 pvefw-logger
> Jul  4 20:00:13 pppve1 kernel: [3375933.212970] [3498513] 100033 3498513   119792     7207     2632      223      4352   516096        0             0 apache2
> Jul  4 20:00:13 pppve1 kernel: [3375933.224713] [3498820] 100104 3498820   575918   235975     9351      160    226464  3424256        0             0 postgres
> Jul  4 20:00:13 pppve1 kernel: [3375933.236579] [3500997] 100033 3500997   119889     7202     2594      192      4416   524288        0             0 apache2
> Jul  4 20:00:13 pppve1 kernel: [3375933.248240] [3501657] 100104 3501657   571325   199025     6001      160    192864  2945024        0             0 postgres
> Jul  4 20:00:13 pppve1 kernel: [3375933.260100] [3502514] 100033 3502514   119119     5907     2004      191      3712   503808        0             0 apache2
> Jul  4 20:00:13 pppve1 kernel: [3375933.271772] [3503679] 100104 3503679   575295   211508     6612      192    204704  2953216        0             0 postgres
> Jul  4 20:00:13 pppve1 kernel: [3375933.283619] [3515234] 100033 3515234   119042     6568     1960      192      4416   503808        0             0 apache2
> Jul  4 20:00:13 pppve1 kernel: [3375933.295362] [3515420] 100104 3515420   569839    97579     4491      160     92928  2293760        0             0 postgres
> Jul  4 20:00:13 pppve1 kernel: [3375933.307155] [3520282] 100033 3520282   119129     5416     2056      192      3168   495616        0             0 apache2
> Jul  4 20:00:13 pppve1 kernel: [3375933.318923] [3520287] 100033 3520287   119015     5709     1894      167      3648   503808        0             0 apache2
> Jul  4 20:00:13 pppve1 kernel: [3375933.330805] [3520288] 100033 3520288   119876     5961     2729      224      3008   507904        0             0 apache2
> Jul  4 20:00:13 pppve1 kernel: [3375933.342648] [3521057] 100104 3521057   573824    46069     8341      128     37600  1830912        0             0 postgres
> Jul  4 20:00:13 pppve1 kernel: [3375933.354567] [3521067] 100104 3521067   574768    99734     7446       96     92192  2134016        0             0 postgres
> Jul  4 20:00:13 pppve1 kernel: [3375933.366512] [3521301] 100104 3521301   569500   174722     4194      160    170368  2482176        0             0 postgres
> Jul  4 20:00:13 pppve1 kernel: [3375933.378484] [3532810] 100033 3532810   118740     4127     1727      160      2240   479232        0             0 apache2
> Jul  4 20:00:13 pppve1 kernel: [3375933.390140] [3532933] 100033 3532933   118971     5064     1864      160      3040   503808        0             0 apache2
> Jul  4 20:00:13 pppve1 kernel: [3375933.401854] [3534151] 100104 3534151   567344   168822     1686      160    166976  2408448        0             0 postgres
> Jul  4 20:00:13 pppve1 kernel: [3375933.413852] [3535832] 100104 3535832   569005    41042     2578      128     38336  1150976        0             0 postgres
> Jul  4 20:00:13 pppve1 kernel: [3375933.425919] [3550993] 100033 3550993   118029     1768     1544      224         0   425984        0             0 apache2
> Jul  4 20:00:13 pppve1 kernel: [3375933.437868] [3560475]   107 3560475    10767      928      160      768         0    77824        0             0 pickup
> Jul  4 20:00:13 pppve1 kernel: [3375933.449513] [3563017] 100101 3563017     9838      256       96      160         0   122880        0             0 pickup
> Jul  4 20:00:13 pppve1 kernel: [3375933.461255] [3575085] 100100 3575085     9561      288      128      160         0   118784        0             0 pickup
> Jul  4 20:00:13 pppve1 kernel: [3375933.473119] [3579986]     0 3579986     1367      384        0      384         0    49152        0             0 sleep
> Jul  4 20:00:13 pppve1 kernel: [3375933.484646] [3579996] 100104 3579996   566249     5031      615      128      4288   450560        0             0 postgres
> Jul  4 20:00:13 pppve1 kernel: [3375933.496645] [3580020]     0 3580020    91269    30310    29606      704         0   409600        0             0 pvescheduler
> Jul  4 20:00:13 pppve1 kernel: [3375933.509585] [3580041]     0 3580041     5005     1920      640     1280         0    81920        0           100 systemd
> Jul  4 20:00:13 pppve1 kernel: [3375933.521297] [3580044]     0 3580044    42685     1538     1218      320         0   102400        0           100 (sd-pam)
> Jul  4 20:00:13 pppve1 kernel: [3375933.533226] [3580125] 100104 3580125   566119     5607      583      704      4320   446464        0             0 postgres
> Jul  4 20:00:13 pppve1 kernel: [3375933.545245] [3580193]     0 3580193     4403     2368      384     1984         0    81920        0             0 sshd
> Jul  4 20:00:13 pppve1 kernel: [3375933.556849] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=qemu.slice,mems_allowed=0,global_oom,task_memcg=/qemu.slice/121.scope,task=kvm,pid=6368,uid=0
> Jul  4 20:00:13 pppve1 kernel: [3375933.573133] Out of memory: Killed process 6368 (kvm) total-vm:22049932kB, anon-rss:16913368kB, file-rss:2944kB, shmem-rss:0kB, UID:0 pgtables:34132kB oom_score_adj:0
> Jul  4 20:00:15 pppve1 kernel: [3375935.378441]  zd16: p1 p2 p3 < p5 p6 >
> Jul  4 20:00:16 pppve1 kernel: [3375936.735383] oom_reaper: reaped process 6368 (kvm), now anon-rss:0kB, file-rss:32kB, shmem-rss:0kB
> Jul  4 20:01:11 pppve1 kernel: [3375991.767379] vmbr0: port 5(tap121i0) entered disabled state
> Jul  4 20:01:11 pppve1 kernel: [3375991.778143] tap121i0 (unregistering): left allmulticast mode
> Jul  4 20:01:11 pppve1 kernel: [3375991.785976] vmbr0: port 5(tap121i0) entered disabled state
> Jul  4 20:01:11 pppve1 kernel: [3375991.791555]  zd128: p1
> Jul  4 20:01:13 pppve1 kernel: [3375993.594688]  zd176: p1 p2
>
>
>> Makes few sense that OOM triggers in 64GB hosts with just 24GB configured in
>> VMs and, probably, less real usage. IMHO it's not VMs what fill your memory
>> up to the point of OOM, but some other process, ZFS ARC, maybe even some mem
>> leak. Maybe some process is producing severe memory fragmentation.
> I can confirm that the server was doing some heavy I/O (a backup), but AFAIK
> nothing more.
>
>
> Mandi! Roland
>
>> it's a little bit weird that OOM kicks in with VMs <32GB RAM when you have 64GB;
>> take a closer look at why this happens, i.e. why OOM thinks there is RAM pressure
> effectively server was running:
>   + vm 100, 2GB
>   + vm 120, 4GB
>   + vm 121, 16GB
>   + vm 127, 4GB
>   + lxc 124, 2GB
>   + lxc 125, 4GB
>
> so exactly 32GB of RAM. But most of the VMs/LXCs barely reached half of their
> allocated RAM...
>
>
>
> Thanks.
>
-- 
_______________________________________________

SOLTECSIS SOLUCIONES TECNOLOGICAS, S.L.
Víctor Rodríguez Cortés
Teléfono: 966 446 046
vrodriguez@soltecsis.com
www.soltecsis.com
_______________________________________________

The information contained in this e-mail is confidential and is intended
solely for the use of the addressee named above. Please note that any use,
disclosure, distribution and/or reproduction of this communication without
express authorization is strictly prohibited under current legislation. If
you have received this message in error, please notify us immediately by
the same means and delete it.


_______________________________________________
pve-user mailing list
pve-user@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-user

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PVE-User] A less aggressive OOM?
  2025-07-10  8:56     ` Victor Rodriguez
@ 2025-07-10  9:08       ` Roland via pve-user
  2025-07-10 14:49         ` dorsy via pve-user
  0 siblings, 1 reply; 10+ messages in thread
From: Roland via pve-user @ 2025-07-10  9:08 UTC (permalink / raw)
  To: Proxmox VE user list, Victor Rodriguez, Marco Gaiarin; +Cc: Roland

[-- Attachment #1: Type: message/rfc822, Size: 59734 bytes --]

From: Roland <devzero@web.de>
To: Proxmox VE user list <pve-user@lists.proxmox.com>, Victor Rodriguez <vrodriguez@soltecsis.com>, Marco Gaiarin <gaio@lilliput.linux.it>
Subject: Re: [PVE-User] A less aggressive OOM?
Date: Thu, 10 Jul 2025 11:08:31 +0200
Message-ID: <54f612ef-4dbc-4382-8ee8-2e11e860b34b@web.de>

If OOM kicks in because half of the RAM is being used for
caches/buffers, I would blame the OOM killer or ZFS for that. The problem
should be resolved at the ZFS or memory management level.

Why kill processes instead of reclaiming ARC? I think that's totally
wrong behaviour.

I will watch out for an appropriate ZFS GitHub issue, or we should consider
opening one.

roland

Am 10.07.25 um 10:56 schrieb Victor Rodriguez:
> Hi,
>
> I checked the OOM log and for me the conclusion is clear (disclaimer: the
> numbers might not be exact):
>
> - You had around 26.7G of memory used by processes + 2.3G of shared memory:
>
> active_anon:17033436kB
> inactive_anon:10633368kB
> shmem:2325708kB
> mapped:2285988kB
> unevictable:158204kB
>
> - It seems like you are also using ZFS (some zd* disks appear in the log)
> and, given that you were doing backups at the time of the OOM, I will
> suppose that your ARC size is set to 50% of the host's memory
> (check with arc_summary), so another 32G of used memory. ARC is
> reclaimable by the host, but usually ZFS does not return that memory
> fast enough, especially during heavy use of the ARC (i.e. reading for a
> backup), so you can't really count on that memory.
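
A minimal sketch of how that assumption can be checked on the node
(arc_summary ships with the ZFS utilities on PVE; the arcstats file is the
standard OpenZFS interface):

  arc_summary | head -n 25        # ARC size, target and min/max in one report
  awk '$1=="size" || $1=="c_max" {printf "%s: %.1f GiB\n", $1, $3/1024/1024/1024}' \
      /proc/spl/kstat/zfs/arcstats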
>
> - Memory was quite fragmented and only small pages were available:
>
> Node 0 Normal:
>   16080*4kB
>   36886*8kB
>   22890*16kB
>   4687*32kB
>   159*64kB
>   10*128kB
>   0*256kB
>   0*512kB
>   0*1024kB
>   0*2048kB
>   0*4096kB
>
>
> Conclusions:
>
> You had 32+26.7+2.3 ≃ 61G of used memory, with the ~3G available being
> small blocks that can't be used for the typically large allocations
> that VMs do. Your host had no choice but to trigger the OOM killer.
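
The same arithmetic can be reproduced from the Mem-Info lines in the dump
(values in kB; the 32G ARC figure is an assumption based on the default 50%
cap, it is not visible in the dump itself):

  echo $(( (17033436 + 10633368) / 1024 / 1024 ))   # active + inactive anon ≈ 26 GiB
  echo $(( 2325708 / 1024 / 1024 ))                 # shmem ≈ 2 GiB
  # plus an assumed ~32 GiB of ARC  =>  ~60-61 GiB of the ~62 GiB the kernel manages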
>
>
> What I would do:
>
> - Lower ARC size [1]
> - Add some swap (never place it on a ZFS disk!). Even some ZRAM could
> help.
> - Lower your VMs' memory: either the total, the minimum memory
> (balloon), or both. Check that the VirtIO drivers + balloon driver are
> installed and working so the host can reclaim memory from the guests.
> - Get more RAM :)
>
>
> Regards
>
>
> [1] 
> https://pve.proxmox.com/wiki/ZFS_on_Linux#sysadmin_zfs_limit_memory_usage
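
On the ballooning point above, a minimal sketch of how it can be verified on
a PVE node (VMID 121 is only an example, taken from the killed process in the
log):

  qm config 121 | grep -Ei 'memory|balloon'   # balloon: 0 would disable ballooning
  qm monitor 121                              # then type: info balloon
  # inside the guest, the balloon driver should be loaded:
  #   lsmod | grep virtio_balloon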
>
>
>
> On 7/8/25 18:31, Marco Gaiarin wrote:
>> Mandi! Victor Rodriguez
>>    In chel di` si favelave...
>>
>>> I would start by analyzing the memory status at the time of the OOM. 
>>> There
>>> should be a some lines in journal/syslog were the kernel writes what 
>>> the
>>> memory looked like and you can figure out why it had to kill a process.
>> This is the full OOM log:
>>
>> Jul  4 20:00:12 pppve1 kernel: [3375931.660119] kvm invoked 
>> oom-killer: gfp_mask=0xcc0(GFP_KERNEL), order=0, oom_score_adj=0
>> Jul  4 20:00:12 pppve1 kernel: [3375931.669158] CPU: 1 PID: 4088 
>> Comm: kvm Tainted: P           O       6.8.12-10-pve #1
>> Jul  4 20:00:12 pppve1 kernel: [3375931.677778] Hardware name: Dell 
>> Inc. PowerEdge T440/021KCD, BIOS 2.24.0 04/02/2025
>> Jul  4 20:00:12 pppve1 kernel: [3375931.686211] Call Trace:
>> Jul  4 20:00:12 pppve1 kernel: [3375931.689504]  <TASK>
>> Jul  4 20:00:12 pppve1 kernel: [3375931.692428] dump_stack_lvl+0x76/0xa0
>> Jul  4 20:00:12 pppve1 kernel: [3375931.696915] dump_stack+0x10/0x20
>> Jul  4 20:00:12 pppve1 kernel: [3375931.701057] dump_header+0x47/0x1f0
>> Jul  4 20:00:12 pppve1 kernel: [3375931.705358] 
>> oom_kill_process+0x110/0x240
>> Jul  4 20:00:12 pppve1 kernel: [3375931.710169] 
>> out_of_memory+0x26e/0x560
>> Jul  4 20:00:12 pppve1 kernel: [3375931.714707] 
>> __alloc_pages+0x10ce/0x1320
>> Jul  4 20:00:12 pppve1 kernel: [3375931.719422] 
>> alloc_pages_mpol+0x91/0x1f0
>> Jul  4 20:00:12 pppve1 kernel: [3375931.724136] alloc_pages+0x54/0xb0
>> Jul  4 20:00:12 pppve1 kernel: [3375931.728320] 
>> __get_free_pages+0x11/0x50
>> Jul  4 20:00:12 pppve1 kernel: [3375931.732938] __pollwait+0x9e/0xe0
>> Jul  4 20:00:12 pppve1 kernel: [3375931.737015] eventfd_poll+0x2c/0x70
>> Jul  4 20:00:12 pppve1 kernel: [3375931.741261] do_sys_poll+0x2f4/0x610
>> Jul  4 20:00:12 pppve1 kernel: [3375931.745587]  ? 
>> __pfx___pollwait+0x10/0x10
>> Jul  4 20:00:12 pppve1 kernel: [3375931.750332]  ? 
>> __pfx_pollwake+0x10/0x10
>> Jul  4 20:00:12 pppve1 kernel: [3375931.754900]  ? 
>> __pfx_pollwake+0x10/0x10
>> Jul  4 20:00:12 pppve1 kernel: [3375931.759463]  ? 
>> __pfx_pollwake+0x10/0x10
>> Jul  4 20:00:12 pppve1 kernel: [3375931.764011]  ? 
>> __pfx_pollwake+0x10/0x10
>> Jul  4 20:00:12 pppve1 kernel: [3375931.768617]  ? 
>> __pfx_pollwake+0x10/0x10
>> Jul  4 20:00:12 pppve1 kernel: [3375931.773165]  ? 
>> __pfx_pollwake+0x10/0x10
>> Jul  4 20:00:12 pppve1 kernel: [3375931.777688]  ? 
>> __pfx_pollwake+0x10/0x10
>> Jul  4 20:00:12 pppve1 kernel: [3375931.782156]  ? 
>> __pfx_pollwake+0x10/0x10
>> Jul  4 20:00:12 pppve1 kernel: [3375931.786622]  ? 
>> __pfx_pollwake+0x10/0x10
>> Jul  4 20:00:12 pppve1 kernel: [3375931.791111] 
>> __x64_sys_ppoll+0xde/0x170
>> Jul  4 20:00:12 pppve1 kernel: [3375931.795656] 
>> x64_sys_call+0x1818/0x2480
>> Jul  4 20:00:12 pppve1 kernel: [3375931.800193] do_syscall_64+0x81/0x170
>> Jul  4 20:00:12 pppve1 kernel: [3375931.804485]  ? 
>> __x64_sys_ppoll+0xf2/0x170
>> Jul  4 20:00:12 pppve1 kernel: [3375931.809100]  ? 
>> syscall_exit_to_user_mode+0x86/0x260
>> Jul  4 20:00:12 pppve1 kernel: [3375931.814566]  ? 
>> do_syscall_64+0x8d/0x170
>> Jul  4 20:00:12 pppve1 kernel: [3375931.818979]  ? 
>> syscall_exit_to_user_mode+0x86/0x260
>> Jul  4 20:00:12 pppve1 kernel: [3375931.824425]  ? 
>> do_syscall_64+0x8d/0x170
>> Jul  4 20:00:12 pppve1 kernel: [3375931.828825]  ? 
>> clear_bhb_loop+0x15/0x70
>> Jul  4 20:00:12 pppve1 kernel: [3375931.833211]  ? 
>> clear_bhb_loop+0x15/0x70
>> Jul  4 20:00:12 pppve1 kernel: [3375931.837579]  ? 
>> clear_bhb_loop+0x15/0x70
>> Jul  4 20:00:12 pppve1 kernel: [3375931.841928] 
>> entry_SYSCALL_64_after_hwframe+0x78/0x80
>> Jul  4 20:00:12 pppve1 kernel: [3375931.847482] RIP: 0033:0x765bb1ce8316
>> Jul  4 20:00:12 pppve1 kernel: [3375931.851577] Code: 7c 24 08 e8 2c 
>> 95 f8 ff 4c 8b 54 24 18 48 8b 74 24 10 41 b8 08 00 00 00 41 89 c1 48 
>> 8b 7c 24 08 4c 89 e2 b8 0f 01 00 00 0f 05 <48> 3d 00 f0 ff ff 77 32 
>> 44 89 cf 89 44 24 08 e8 76 95 f8 ff 8b 44
>> Jul  4 20:00:12 pppve1 kernel: [3375931.871194] RSP: 
>> 002b:00007fff2d39ea20 EFLAGS: 00000293 ORIG_RAX: 000000000000010f
>> Jul  4 20:00:12 pppve1 kernel: [3375931.879298] RAX: ffffffffffffffda 
>> RBX: 00006045d3e68470 RCX: 0000765bb1ce8316
>> Jul  4 20:00:12 pppve1 kernel: [3375931.886963] RDX: 00007fff2d39ea40 
>> RSI: 0000000000000010 RDI: 00006045d4de5f20
>> Jul  4 20:00:12 pppve1 kernel: [3375931.894630] RBP: 00007fff2d39eaac 
>> R08: 0000000000000008 R09: 0000000000000000
>> Jul  4 20:00:12 pppve1 kernel: [3375931.902299] R10: 0000000000000000 
>> R11: 0000000000000293 R12: 00007fff2d39ea40
>> Jul  4 20:00:12 pppve1 kernel: [3375931.909951] R13: 00006045d3e68470 
>> R14: 00006045b014d570 R15: 00007fff2d39eab0
>> Jul  4 20:00:12 pppve1 kernel: [3375931.917656]  </TASK>
>> Jul  4 20:00:12 pppve1 kernel: [3375931.920515] Mem-Info:
>> Jul  4 20:00:12 pppve1 kernel: [3375931.923465] active_anon:4467063 
>> inactive_anon:2449638 isolated_anon:0
>> Jul  4 20:00:12 pppve1 kernel: [3375931.923465]  active_file:611 
>> inactive_file:303 isolated_file:0
>> Jul  4 20:00:12 pppve1 kernel: [3375931.923465] unevictable:39551 
>> dirty:83 writeback:237
>> Jul  4 20:00:12 pppve1 kernel: [3375931.923465] 
>> slab_reclaimable:434580 slab_unreclaimable:1792355
>> Jul  4 20:00:12 pppve1 kernel: [3375931.923465]  mapped:571491 
>> shmem:581427 pagetables:26365
>> Jul  4 20:00:12 pppve1 kernel: [3375931.923465] sec_pagetables:11751 
>> bounce:0
>> Jul  4 20:00:12 pppve1 kernel: [3375931.923465] 
>> kernel_misc_reclaimable:0
>> Jul  4 20:00:12 pppve1 kernel: [3375931.923465]  free:234516 
>> free_pcp:5874 free_cma:0
>> Jul  4 20:00:12 pppve1 kernel: [3375931.969518] Node 0 
>> active_anon:17033436kB inactive_anon:10633368kB active_file:64kB 
>> inactive_file:3196kB unevictable:158204kB isolated(anon):0kB 
>> isolated(file):0kB mapped:2285988kB dirty:356kB writeback:948kB 
>> shmem:2325708kB shmem_thp:0kB shmem_pmdmapped:0kB anon_thp:866304kB 
>> writeback_tmp:0kB kernel_stack:11520kB pagetables:105460kB 
>> sec_pagetables:47004kB all_unreclaimable? no
>> Jul  4 20:00:12 pppve1 kernel: [3375932.004977] Node 0 DMA 
>> free:11264kB boost:0kB min:12kB low:24kB high:36kB 
>> reserved_highatomic:0KB active_anon:0kB inactive_anon:0kB 
>> active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB 
>> present:15996kB managed:15360kB mlocked:0kB bounce:0kB free_pcp:0kB 
>> local_pcp:0kB free_cma:0kB
>> Jul  4 20:00:12 pppve1 kernel: [3375932.032646] lowmem_reserve[]: 0 
>> 1527 63844 63844 63844
>> Jul  4 20:00:12 pppve1 kernel: [3375932.038675] Node 0 DMA32 
>> free:252428kB boost:0kB min:1616kB low:3176kB high:4736kB 
>> reserved_highatomic:2048KB active_anon:310080kB 
>> inactive_anon:986436kB active_file:216kB inactive_file:0kB 
>> unevictable:0kB writepending:0kB present:1690624kB managed:1623508kB 
>> mlocked:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
>> Jul  4 20:00:12 pppve1 kernel: [3375932.069110] lowmem_reserve[]: 0 0 
>> 62317 62317 62317
>> Jul  4 20:00:12 pppve1 kernel: [3375932.074979] Node 0 Normal 
>> free:814396kB boost:290356kB min:356304kB low:420116kB high:483928kB 
>> reserved_highatomic:346112KB active_anon:11258684kB 
>> inactive_anon:15111580kB active_file:0kB inactive_file:2316kB 
>> unevictable:158204kB writepending:1304kB present:65011712kB 
>> managed:63820796kB mlocked:155132kB bounce:0kB free_pcp:12728kB 
>> local_pcp:0kB free_cma:0kB
>> Jul  4 20:00:12 pppve1 kernel: [3375932.109188] lowmem_reserve[]: 0 0 
>> 0 0 0
>> Jul  4 20:00:12 pppve1 kernel: [3375932.114119] Node 0 DMA: 0*4kB 
>> 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 1*1024kB (U) 
>> 1*2048kB (M) 2*4096kB (M) = 11264kB
>> Jul  4 20:00:12 pppve1 kernel: [3375932.127796] Node 0 DMA32: 
>> 5689*4kB (UMH) 1658*8kB (UMH) 381*16kB (UM) 114*32kB (UME) 97*64kB 
>> (UME) 123*128kB (UMEH) 87*256kB (MEH) 96*512kB (UMEH) 58*1024kB (UME) 
>> 5*2048kB (UME) 11*4096kB (ME) = 253828kB
>> Jul  4 20:00:12 pppve1 kernel: [3375932.148050] Node 0 Normal: 
>> 16080*4kB (UMEH) 36886*8kB (UMEH) 22890*16kB (UMEH) 4687*32kB (MEH) 
>> 159*64kB (UMEH) 10*128kB (UE) 0*256kB 0*512kB 0*1024kB 0*2048kB 
>> 0*4096kB = 887088kB
>> Jul  4 20:00:12 pppve1 kernel: [3375932.165899] Node 0 
>> hugepages_total=0 hugepages_free=0 hugepages_surp=0 
>> hugepages_size=1048576kB
>> Jul  4 20:00:12 pppve1 kernel: [3375932.175876] Node 0 
>> hugepages_total=0 hugepages_free=0 hugepages_surp=0 
>> hugepages_size=2048kB
>> Jul  4 20:00:12 pppve1 kernel: [3375932.185569] 586677 total 
>> pagecache pages
>> Jul  4 20:00:12 pppve1 kernel: [3375932.190737] 0 pages in swap cache
>> Jul  4 20:00:12 pppve1 kernel: [3375932.195285] Free swap  = 0kB
>> Jul  4 20:00:12 pppve1 kernel: [3375932.199404] Total swap = 0kB
>> Jul  4 20:00:12 pppve1 kernel: [3375932.203513] 16679583 pages RAM
>> Jul  4 20:00:12 pppve1 kernel: [3375932.207787] 0 pages 
>> HighMem/MovableOnly
>> Jul  4 20:00:12 pppve1 kernel: [3375932.212819] 314667 pages reserved
>> Jul  4 20:00:12 pppve1 kernel: [3375932.217321] 0 pages hwpoisoned
>> Jul  4 20:00:12 pppve1 kernel: [3375932.221525] Tasks state (memory 
>> values in pages):
>> Jul  4 20:00:12 pppve1 kernel: [3375932.227400] [  pid  ]   uid tgid 
>> total_vm      rss rss_anon rss_file rss_shmem pgtables_bytes swapents 
>> oom_score_adj name
>> Jul  4 20:00:12 pppve1 kernel: [3375932.239680] [   1959]   106 
>> 1959     1971      544       96      448         0 61440        
>> 0             0 rpcbind
>> Jul  4 20:00:12 pppve1 kernel: [3375932.251672] [   1982]   104 
>> 1982     2350      672      160      512         0 57344        
>> 0          -900 dbus-daemon
>> Jul  4 20:00:12 pppve1 kernel: [3375932.264020] [   1991]     0 
>> 1991     1767      275       83      192         0 57344        
>> 0             0 ksmtuned
>> Jul  4 20:00:12 pppve1 kernel: [3375932.276094] [   1995]     0 
>> 1995    69541      480       64      416         0 86016        
>> 0             0 pve-lxc-syscall
>> Jul  4 20:00:12 pppve1 kernel: [3375932.289686] [   2002]     0 
>> 2002     1330      384       32      352         0 53248        
>> 0             0 qmeventd
>> Jul  4 20:00:12 pppve1 kernel: [3375932.301811] [   2003]     0 
>> 2003    55449      727      247      480         0 86016        
>> 0             0 rsyslogd
>> Jul  4 20:00:12 pppve1 kernel: [3375932.313919] [   2004]     0 
>> 2004     3008      928      448      480         0 69632        
>> 0             0 smartd
>> Jul  4 20:00:12 pppve1 kernel: [3375932.325834] [   2009]     0 
>> 2009     6386      992      224      768         0 77824        
>> 0             0 systemd-logind
>> Jul  4 20:00:12 pppve1 kernel: [3375932.339438] [   2010]     0 
>> 2010      584      256        0      256         0 40960        
>> 0         -1000 watchdog-mux
>> Jul  4 20:00:12 pppve1 kernel: [3375932.352936] [   2021]     0 
>> 2021    60174      928      256      672         0 90112        
>> 0             0 zed
>> Jul  4 20:00:12 pppve1 kernel: [3375932.364626] [   2136]     0 
>> 2136    75573      256       64      192         0 86016        
>> 0         -1000 lxcfs
>> Jul  4 20:00:12 pppve1 kernel: [3375932.376485] [   2397]     0 
>> 2397     2208      480       64      416         0 61440        
>> 0             0 lxc-monitord
>> Jul  4 20:00:12 pppve1 kernel: [3375932.389169] [   2421]     0 
>> 2421    40673      454       70      384         0 73728        
>> 0             0 apcupsd
>> Jul  4 20:00:12 pppve1 kernel: [3375932.400685] [   2426]     0 
>> 2426     3338      428      172      256         0 69632        
>> 0             0 iscsid
>> Jul  4 20:00:12 pppve1 kernel: [3375932.412121] [   2427]     0 
>> 2427     3464     3343      431     2912         0 77824        
>> 0           -17 iscsid
>> Jul  4 20:00:12 pppve1 kernel: [3375932.423754] [   2433]     0 
>> 2433     3860     1792      320     1472         0 77824        
>> 0         -1000 sshd
>> Jul  4 20:00:12 pppve1 kernel: [3375932.435208] [   2461]     0 
>> 2461   189627     2688     1344     1344         0 155648        
>> 0             0 dsm_ism_srvmgrd
>> Jul  4 20:00:12 pppve1 kernel: [3375932.448290] [   2490]   113 
>> 2490     4721      750      142      608         0 61440        
>> 0             0 chronyd
>> Jul  4 20:00:12 pppve1 kernel: [3375932.459988] [   2492]   113 
>> 2492     2639      502      118      384         0 61440        
>> 0             0 chronyd
>> Jul  4 20:00:12 pppve1 kernel: [3375932.471684] [   2531]     0 
>> 2531     1469      448       32      416         0 49152        
>> 0             0 agetty
>> Jul  4 20:00:12 pppve1 kernel: [3375932.483269] [   2555]     0 
>> 2555   126545      673      244      429         0 147456        
>> 0             0 rrdcached
>> Jul  4 20:00:12 pppve1 kernel: [3375932.483275] [   2582]     0 
>> 2582   155008    15334     3093      864     11377 434176        
>> 0             0 pmxcfs
>> Jul  4 20:00:12 pppve1 kernel: [3375932.506653] [   2654]     0 
>> 2654    10667      614      134      480         0 77824        
>> 0             0 master
>> Jul  4 20:00:12 pppve1 kernel: [3375932.517986] [   2656]   107 
>> 2656    10812      704      160      544         0 73728        
>> 0             0 qmgr
>> Jul  4 20:00:12 pppve1 kernel: [3375932.529118] [   2661]     0 
>> 2661   139892    41669    28417     2980     10272 405504        
>> 0             0 corosync
>> Jul  4 20:00:12 pppve1 kernel: [3375932.540553] [   2662]     0 
>> 2662     1653      576       32      544         0 53248        
>> 0             0 cron
>> Jul  4 20:00:12 pppve1 kernel: [3375932.551657] [   2664]     0 
>> 2664     1621      480       96      384         0 57344        
>> 0             0 proxmox-firewal
>> Jul  4 20:00:12 pppve1 kernel: [3375932.564093] [   3164]     0 
>> 3164    83332    26227    25203      768       256 360448        
>> 0             0 pve-firewall
>> Jul  4 20:00:12 pppve1 kernel: [3375932.576192] [   3233]     0 
>> 3233    85947    28810    27242     1216       352 385024        
>> 0             0 pvestatd
>> Jul  4 20:00:12 pppve1 kernel: [3375932.587638] [   3417]     0 
>> 3417    93674    36011    35531      480         0 438272        
>> 0             0 pvedaemon
>> Jul  4 20:00:12 pppve1 kernel: [3375932.599167] [   3421]     0 
>> 3421    95913    37068    35884     1120        64 454656        
>> 0             0 pvedaemon worke
>> Jul  4 20:00:12 pppve1 kernel: [3375932.611536] [   3424]     0 
>> 3424    96072    36972    35852     1088        32 454656        
>> 0             0 pvedaemon worke
>> Jul  4 20:00:12 pppve1 kernel: [3375932.623977] [   3426]     0 
>> 3426    96167    37068    35948     1056        64 458752        
>> 0             0 pvedaemon worke
>> Jul  4 20:00:12 pppve1 kernel: [3375932.636698] [   3558]     0 
>> 3558    90342    29540    28676      608       256 385024        
>> 0             0 pve-ha-crm
>> Jul  4 20:00:12 pppve1 kernel: [3375932.648477] [   3948]    33 
>> 3948    94022    37705    35849     1856         0 471040        
>> 0             0 pveproxy
>> Jul  4 20:00:12 pppve1 kernel: [3375932.660083] [   3954]    33 
>> 3954    21688    14368    12736     1632         0 221184        
>> 0             0 spiceproxy
>> Jul  4 20:00:12 pppve1 kernel: [3375932.671862] [   3956]     0 
>> 3956    90222    29321    28521      544       256 397312        
>> 0             0 pve-ha-lrm
>> Jul  4 20:00:12 pppve1 kernel: [3375932.683484] [   3994]     0 3994  
>> 1290140   706601   705993      608         0 6389760        
>> 0             0 kvm
>> Jul  4 20:00:12 pppve1 kernel: [3375932.694551] [   4088]     0 4088  
>> 1271416  1040767  1040223      544         0 8994816        
>> 0             0 kvm
>> Jul  4 20:00:12 pppve1 kernel: [3375932.705624] [   4160]     0 
>> 4160    89394    30149    29541      608         0 380928        
>> 0             0 pvescheduler
>> Jul  4 20:00:12 pppve1 kernel: [3375932.717864] [   4710]     0 
>> 4710     1375      480       32      448         0 57344        
>> 0             0 agetty
>> Jul  4 20:00:12 pppve1 kernel: [3375932.729183] [   5531]     0 
>> 5531   993913   567351   566647      704         0 5611520        
>> 0             0 kvm
>> Jul  4 20:00:12 pppve1 kernel: [3375932.740212] [   6368]     0 6368  
>> 5512483  4229046  4228342      704         0 34951168        
>> 0             0 kvm
>> Jul  4 20:00:12 pppve1 kernel: [3375932.751255] [   9796]     0 
>> 9796     1941      768       64      704         0 57344        
>> 0             0 lxc-start
>> Jul  4 20:00:12 pppve1 kernel: [3375932.762840] [   9808] 100000  
>> 9808     3875      160       32      128         0 77824        
>> 0             0 init
>> Jul  4 20:00:12 pppve1 kernel: [3375932.774063] [  11447] 100000 
>> 11447     9272      192       64      128         0 118784        
>> 0             0 rpcbind
>> Jul  4 20:00:12 pppve1 kernel: [3375932.785534] [  11620] 100000 
>> 11620    45718      240      112      128         0 126976        
>> 0             0 rsyslogd
>> Jul  4 20:00:12 pppve1 kernel: [3375932.797241] [  11673] 100000 
>> 11673     4758      195       35      160         0 81920        
>> 0             0 atd
>> Jul  4 20:00:12 pppve1 kernel: [3375932.808516] [  11748] 100000 
>> 11748     6878      228       36      192         0 98304        
>> 0             0 cron
>> Jul  4 20:00:12 pppve1 kernel: [3375932.819868] [  11759] 100102 
>> 11759    10533      257       65      192         0 122880        
>> 0             0 dbus-daemon
>> Jul  4 20:00:12 pppve1 kernel: [3375932.832328] [  11765] 100000 
>> 11765    13797      315      155      160         0 143360        
>> 0             0 sshd
>> Jul  4 20:00:12 pppve1 kernel: [3375932.843547] [  11989] 100104 
>> 11989   565602    19744      288      160     19296 372736        
>> 0             0 postgres
>> Jul  4 20:00:12 pppve1 kernel: [3375932.855266] [  12169] 100104 
>> 12169   565938   537254      678      192    536384 4517888        
>> 0             0 postgres
>> Jul  4 20:00:12 pppve1 kernel: [3375932.866950] [  12170] 100104 
>> 12170   565859   199654      550      224    198880 4296704        
>> 0             0 postgres
>> Jul  4 20:00:12 pppve1 kernel: [3375932.878525] [  12171] 100104 
>> 12171   565859     4710      358      224      4128 241664        
>> 0             0 postgres
>> Jul  4 20:00:12 pppve1 kernel: [3375932.890252] [  12172] 100104 
>> 12172   565962     7654      518      192      6944 827392        
>> 0             0 postgres
>> Jul  4 20:00:12 pppve1 kernel: [3375932.901845] [  12173] 100104 
>> 12173    20982      742      518      224         0 200704        
>> 0             0 postgres
>> Jul  4 20:00:13 pppve1 kernel: [3375932.913421] [  13520] 100000 
>> 13520     9045      192      128       64         0 114688        
>> 0             0 master
>> Jul  4 20:00:13 pppve1 kernel: [3375932.924809] [  13536] 100100 
>> 13536     9601      320      128      192         0 126976        
>> 0             0 qmgr
>> Jul  4 20:00:13 pppve1 kernel: [3375932.936088] [  13547] 100000 
>> 13547     3168      192       32      160         0 73728        
>> 0             0 getty
>> Jul  4 20:00:13 pppve1 kernel: [3375932.947424] [  13548] 100000 
>> 13548     3168      160       32      128         0 73728        
>> 0             0 getty
>> Jul  4 20:00:13 pppve1 kernel: [3375932.958761] [1302486]     0 
>> 1302486     1941      768       96      672         0 53248        
>> 0             0 lxc-start
>> Jul  4 20:00:13 pppve1 kernel: [3375932.970490] [1302506] 100000 
>> 1302506     2115      128       32       96         0 65536        
>> 0             0 init
>> Jul  4 20:00:13 pppve1 kernel: [3375932.981999] [1302829] 100001 
>> 1302829     2081      128        0      128         0 61440        
>> 0             0 portmap
>> Jul  4 20:00:13 pppve1 kernel: [3375932.993763] [1302902] 100000 
>> 1302902    27413      160       64       96         0 122880        
>> 0             0 rsyslogd
>> Jul  4 20:00:13 pppve1 kernel: [3375933.005719] [1302953] 100000 
>> 1302953   117996     1654     1366      227        61 450560        
>> 0             0 apache2
>> Jul  4 20:00:13 pppve1 kernel: [3375933.017459] [1302989] 100000 
>> 1302989     4736       97       33       64         0 81920        
>> 0             0 atd
>> Jul  4 20:00:13 pppve1 kernel: [3375933.028905] [1303004] 100104 
>> 1303004     5843       64       32       32         0 94208        
>> 0             0 dbus-daemon
>> Jul  4 20:00:13 pppve1 kernel: [3375933.041272] [1303030] 100000 
>> 1303030    12322      334      110      224         0 139264        
>> 0             0 sshd
>> Jul  4 20:00:13 pppve1 kernel: [3375933.052755] [1303048] 100000 
>> 1303048     5664       64       32       32         0 94208        
>> 0             0 cron
>> Jul  4 20:00:13 pppve1 kernel: [3375933.064220] [1303255] 100000 
>> 1303255     9322      224       96      128         0 118784        
>> 0             0 master
>> Jul  4 20:00:13 pppve1 kernel: [3375933.075896] [1303284] 100101 
>> 1303284     9878      352      128      224         0 122880        
>> 0             0 qmgr
>> Jul  4 20:00:13 pppve1 kernel: [3375933.087405] [1303285] 100000 
>> 1303285     1509       32        0       32         0 61440        
>> 0             0 getty
>> Jul  4 20:00:13 pppve1 kernel: [3375933.099008] [1303286] 100000 
>> 1303286     1509       64        0       64         0 61440        
>> 0             0 getty
>> Jul  4 20:00:13 pppve1 kernel: [3375933.110571] [1420994]    33 
>> 1420994    21749    13271    12759      512         0 204800        
>> 0             0 spiceproxy work
>> Jul  4 20:00:13 pppve1 kernel: [3375933.123378] [1421001]    33 
>> 1421001    94055    37044    35892     1152         0 434176        
>> 0             0 pveproxy worker
>> Jul  4 20:00:13 pppve1 kernel: [3375933.136284] [1421002]    33 
>> 1421002    94055    36980    35860     1120         0 434176        
>> 0             0 pveproxy worker
>> Jul  4 20:00:13 pppve1 kernel: [3375933.149173] [1421003]    33 
>> 1421003    94055    37044    35892     1152         0 434176        
>> 0             0 pveproxy worker
>> Jul  4 20:00:13 pppve1 kernel: [3375933.162040] [2316827]     0 
>> 2316827     6820     1088      224      864         0 69632        
>> 0         -1000 systemd-udevd
>> Jul  4 20:00:13 pppve1 kernel: [3375933.174778] [2316923]     0 
>> 2316923    51282     2240      224     2016         0 438272        
>> 0          -250 systemd-journal
>> Jul  4 20:00:13 pppve1 kernel: [3375933.187768] [3148356]     0 
>> 3148356    32681    21120    19232     1888         0 249856        
>> 0             0 glpi-agent (tag
>> Jul  4 20:00:13 pppve1 kernel: [3375933.200481] [3053571]     0 
>> 3053571    19798      480       32      448         0 57344        
>> 0             0 pvefw-logger
>> Jul  4 20:00:13 pppve1 kernel: [3375933.212970] [3498513] 100033 
>> 3498513   119792     7207     2632      223      4352 516096        
>> 0             0 apache2
>> Jul  4 20:00:13 pppve1 kernel: [3375933.224713] [3498820] 100104 
>> 3498820   575918   235975     9351      160    226464 3424256        
>> 0             0 postgres
>> Jul  4 20:00:13 pppve1 kernel: [3375933.236579] [3500997] 100033 
>> 3500997   119889     7202     2594      192      4416 524288        
>> 0             0 apache2
>> Jul  4 20:00:13 pppve1 kernel: [3375933.248240] [3501657] 100104 
>> 3501657   571325   199025     6001      160    192864 2945024        
>> 0             0 postgres
>> Jul  4 20:00:13 pppve1 kernel: [3375933.260100] [3502514] 100033 
>> 3502514   119119     5907     2004      191      3712 503808        
>> 0             0 apache2
>> Jul  4 20:00:13 pppve1 kernel: [3375933.271772] [3503679] 100104 
>> 3503679   575295   211508     6612      192    204704 2953216        
>> 0             0 postgres
>> Jul  4 20:00:13 pppve1 kernel: [3375933.283619] [3515234] 100033 
>> 3515234   119042     6568     1960      192      4416 503808        
>> 0             0 apache2
>> Jul  4 20:00:13 pppve1 kernel: [3375933.295362] [3515420] 100104 
>> 3515420   569839    97579     4491      160     92928 2293760        
>> 0             0 postgres
>> Jul  4 20:00:13 pppve1 kernel: [3375933.307155] [3520282] 100033 
>> 3520282   119129     5416     2056      192      3168 495616        
>> 0             0 apache2
>> Jul  4 20:00:13 pppve1 kernel: [3375933.318923] [3520287] 100033 
>> 3520287   119015     5709     1894      167      3648 503808        
>> 0             0 apache2
>> Jul  4 20:00:13 pppve1 kernel: [3375933.330805] [3520288] 100033 
>> 3520288   119876     5961     2729      224      3008 507904        
>> 0             0 apache2
>> Jul  4 20:00:13 pppve1 kernel: [3375933.342648] [3521057] 100104 
>> 3521057   573824    46069     8341      128     37600 1830912        
>> 0             0 postgres
>> Jul  4 20:00:13 pppve1 kernel: [3375933.354567] [3521067] 100104 
>> 3521067   574768    99734     7446       96     92192 2134016        
>> 0             0 postgres
>> Jul  4 20:00:13 pppve1 kernel: [3375933.366512] [3521301] 100104 
>> 3521301   569500   174722     4194      160    170368 2482176        
>> 0             0 postgres
>> Jul  4 20:00:13 pppve1 kernel: [3375933.378484] [3532810] 100033 
>> 3532810   118740     4127     1727      160      2240 479232        
>> 0             0 apache2
>> Jul  4 20:00:13 pppve1 kernel: [3375933.390140] [3532933] 100033 
>> 3532933   118971     5064     1864      160      3040 503808        
>> 0             0 apache2
>> Jul  4 20:00:13 pppve1 kernel: [3375933.401854] [3534151] 100104 
>> 3534151   567344   168822     1686      160    166976 2408448        
>> 0             0 postgres
>> Jul  4 20:00:13 pppve1 kernel: [3375933.413852] [3535832] 100104 
>> 3535832   569005    41042     2578      128     38336 1150976        
>> 0             0 postgres
>> Jul  4 20:00:13 pppve1 kernel: [3375933.425919] [3550993] 100033 
>> 3550993   118029     1768     1544      224         0 425984        
>> 0             0 apache2
>> Jul  4 20:00:13 pppve1 kernel: [3375933.437868] [3560475]   107 
>> 3560475    10767      928      160      768         0 77824        
>> 0             0 pickup
>> Jul  4 20:00:13 pppve1 kernel: [3375933.449513] [3563017] 100101 
>> 3563017     9838      256       96      160         0 122880        
>> 0             0 pickup
>> Jul  4 20:00:13 pppve1 kernel: [3375933.461255] [3575085] 100100 
>> 3575085     9561      288      128      160         0 118784        
>> 0             0 pickup
>> Jul  4 20:00:13 pppve1 kernel: [3375933.473119] [3579986]     0 
>> 3579986     1367      384        0      384         0 49152        
>> 0             0 sleep
>> Jul  4 20:00:13 pppve1 kernel: [3375933.484646] [3579996] 100104 
>> 3579996   566249     5031      615      128      4288 450560        
>> 0             0 postgres
>> Jul  4 20:00:13 pppve1 kernel: [3375933.496645] [3580020]     0 
>> 3580020    91269    30310    29606      704         0 409600        
>> 0             0 pvescheduler
>> Jul  4 20:00:13 pppve1 kernel: [3375933.509585] [3580041]     0 
>> 3580041     5005     1920      640     1280         0 81920        
>> 0           100 systemd
>> Jul  4 20:00:13 pppve1 kernel: [3375933.521297] [3580044]     0 
>> 3580044    42685     1538     1218      320         0 102400        
>> 0           100 (sd-pam)
>> Jul  4 20:00:13 pppve1 kernel: [3375933.533226] [3580125] 100104 
>> 3580125   566119     5607      583      704      4320 446464        
>> 0             0 postgres
>> Jul  4 20:00:13 pppve1 kernel: [3375933.545245] [3580193]     0 
>> 3580193     4403     2368      384     1984         0 81920        
>> 0             0 sshd
>> Jul  4 20:00:13 pppve1 kernel: [3375933.556849] 
>> oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=qemu.slice,mems_allowed=0,global_oom,task_memcg=/qemu.slice/121.scope,task=kvm,pid=6368,uid=0
>> Jul  4 20:00:13 pppve1 kernel: [3375933.573133] Out of memory: Killed 
>> process 6368 (kvm) total-vm:22049932kB, anon-rss:16913368kB, 
>> file-rss:2944kB, shmem-rss:0kB, UID:0 pgtables:34132kB oom_score_adj:0
>> Jul  4 20:00:15 pppve1 kernel: [3375935.378441]  zd16: p1 p2 p3 < p5 
>> p6 >
>> Jul  4 20:00:16 pppve1 kernel: [3375936.735383] oom_reaper: reaped 
>> process 6368 (kvm), now anon-rss:0kB, file-rss:32kB, shmem-rss:0kB
>> Jul  4 20:01:11 pppve1 kernel: [3375991.767379] vmbr0: port 
>> 5(tap121i0) entered disabled state
>> Jul  4 20:01:11 pppve1 kernel: [3375991.778143] tap121i0 
>> (unregistering): left allmulticast mode
>> Jul  4 20:01:11 pppve1 kernel: [3375991.785976] vmbr0: port 
>> 5(tap121i0) entered disabled state
>> Jul  4 20:01:11 pppve1 kernel: [3375991.791555]  zd128: p1
>> Jul  4 20:01:13 pppve1 kernel: [3375993.594688]  zd176: p1 p2
>>
>>
>>> Makes few sense that OOM triggers in 64GB hosts with just 24GB 
>>> configured in
>>> VMs and, probably, less real usage. IMHO it's not VMs what fill your 
>>> memory
>>> up to the point of OOM, but some other process, ZFS ARC, maybe even 
>>> some mem
>>> leak. Maybe some process is producing severe memory fragmentation.
>> I can confirm that the server was doing some heavy I/O (a backup), but AFAIK
>> nothing more.
>>
>>
>> Mandi! Roland
>>
>>> it's a little bit weird that OOM kicks in with VMs <32GB RAM when
>>> you have 64GB;
>>> take a closer look at why this happens, i.e. why OOM thinks there is
>>> RAM pressure
>> effectively server was running:
>>   + vm 100, 2GB
>>   + vm 120, 4GB
>>   + vm 121, 16GB
>>   + vm 127, 4GB
>>   + lxc 124, 2GB
>>   + lxc 125, 4GB
>>
>> so exactly 32GB of RAM. But most of the VMs/LXCs barely reached half of
>> their allocated RAM...
>>
>>
>>
>> Thanks.
>>

[-- Attachment #2: Type: text/plain, Size: 157 bytes --]

_______________________________________________
pve-user mailing list
pve-user@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-user

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PVE-User] A less aggressive OOM?
  2025-07-10  9:08       ` Roland via pve-user
@ 2025-07-10 14:49         ` dorsy via pve-user
  2025-07-10 16:11           ` Roland via pve-user
                             ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: dorsy via pve-user @ 2025-07-10 14:49 UTC (permalink / raw)
  To: pve-user; +Cc: dorsy

[-- Attachment #1: Type: message/rfc822, Size: 7780 bytes --]

From: dorsy <dorsyka@yahoo.com>
To: pve-user@lists.proxmox.com
Subject: Re: [PVE-User] A less aggressive OOM?
Date: Thu, 10 Jul 2025 16:49:34 +0200
Message-ID: <caa07d1c-6898-434a-85f6-274b1511ed06@yahoo.com>


On 7/10/2025 11:08 AM, Roland via pve-user wrote:
> if OOM kicks in because half of the RAM is being used for
> caches/buffers, I would blame the OOM killer or ZFS for that. The problem
> should be resolved at the ZFS or memory management level.

Absolutely not!
You are responsible for giving ZFS the limits, as is even described in the
Proxmox documentation here:
https://pve.proxmox.com/wiki/ZFS_on_Linux#sysadmin_zfs_limit_memory_usage
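
For completeness, a minimal sketch of what that wiki section describes (the
4 GiB value is only an example; pick a limit that fits the workload):

  # runtime change, takes effect immediately
  echo $((4 * 1024*1024*1024)) > /sys/module/zfs/parameters/zfs_arc_max
  # persistent across reboots
  echo "options zfs zfs_arc_max=$((4 * 1024*1024*1024))" > /etc/modprobe.d/zfs.conf
  update-initramfs -u -k all   # required when the root filesystem is on ZFS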

> _______________________________________________
> pve-user mailing list
> pve-user@lists.proxmox.com
> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-user



[-- Attachment #2: Type: text/plain, Size: 157 bytes --]

_______________________________________________
pve-user mailing list
pve-user@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-user

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PVE-User] A less aggressive OOM?
  2025-07-10 14:49         ` dorsy via pve-user
@ 2025-07-10 16:11           ` Roland via pve-user
       [not found]           ` <98ace9cf-a47f-40cd-8796-6bec3558ebb0@web.de>
  2025-07-13 14:28           ` Marco Gaiarin
  2 siblings, 0 replies; 10+ messages in thread
From: Roland via pve-user @ 2025-07-10 16:11 UTC (permalink / raw)
  To: Proxmox VE user list; +Cc: Roland

[-- Attachment #1: Type: message/rfc822, Size: 10695 bytes --]

From: Roland <devzero@web.de>
To: Proxmox VE user list <pve-user@lists.proxmox.com>
Subject: Re: [PVE-User] A less aggressive OOM?
Date: Thu, 10 Jul 2025 18:11:41 +0200
Message-ID: <98ace9cf-a47f-40cd-8796-6bec3558ebb0@web.de>

IMHO, killing processes because the ARC is using too much RAM that can't be
reclaimed fast enough is a failure in overall memory coordination.

We can set ZFS limits as a workaround, yes, but ZFS and the OOM killer are to
blame!

1. ZFS should free up memory faster, just as memory is also freed from other
buffers/caches.

2. The OOM killer should put pressure on the ARC or try to reclaim pages from
it first, instead of killing kvm processes. Maybe the OOM killer could be made
ARC-aware?

roland


> On 7/10/2025 11:08 AM, Roland via pve-user wrote:
> if OOM kicks in because half of the RAM is being used for
> caches/buffers, I would blame the OOM killer or ZFS for that. The problem
> should be resolved at the ZFS or memory management level.
>
> Absolutely not!
> You are responsible for giving ZFS the limits, as is even described in
> the Proxmox documentation here:
> https://pve.proxmox.com/wiki/ZFS_on_Linux#sysadmin_zfs_limit_memory_usage

> _______________________________________________
> pve-user mailing list
> pve-user@lists.proxmox.com
> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-user



_______________________________________________
pve-user mailing list
pve-user@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-user


Am 10.07.25 um 16:49 schrieb dorsy via pve-user:
> On 7/10/2025 11:08 AM, Roland via pve-user wrote:
> if OOM kicks in because half of the ram is being used for 
> caches/buffers, i would blame OOMkiller or ZFS for tha. The problem 
> should be resolved at zfs or memory management level.
>
> Absolutely no!
> You are responsible for giving ZFS the limits. As even described in 
> the proxmox documentation here:
> https://pve.proxmox.com/wiki/ZFS_on_Linux#sysadmin_zfs_limit_memory_usage
>
>> _______________________________________________
>> pve-user mailing list
>> pve-user@lists.proxmox.com
>> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-user
>
>
>
> _______________________________________________
> pve-user mailing list
> pve-user@lists.proxmox.com
> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-user

[-- Attachment #2: Type: text/plain, Size: 157 bytes --]

_______________________________________________
pve-user mailing list
pve-user@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-user

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PVE-User] A less aggressive OOM?
       [not found]           ` <98ace9cf-a47f-40cd-8796-6bec3558ebb0@web.de>
@ 2025-07-10 16:15             ` dorsy via pve-user
  0 siblings, 0 replies; 10+ messages in thread
From: dorsy via pve-user @ 2025-07-10 16:15 UTC (permalink / raw)
  To: Proxmox VE user list; +Cc: dorsy

[-- Attachment #1: Type: message/rfc822, Size: 9458 bytes --]

From: dorsy <dorsyka@yahoo.com>
To: Proxmox VE user list <pve-user@lists.proxmox.com>
Subject: Re: [PVE-User] A less aggressive OOM?
Date: Thu, 10 Jul 2025 18:15:55 +0200
Message-ID: <2a7aed75-5870-4e23-994a-6f78bbe9dd85@yahoo.com>

You did overcommit memory by not setting appropriate ZFS limits.

So the OOM killer saved your machine from a hang in a memory-constrained
situation. It's as simple as that.

Read The Fine Manual!

On 7/10/2025 6:11 PM, Roland wrote:
> imho, killing processes because of arc using too much ram which can't 
> be reclaimed fast enough is a failure in overall memory coordination.
>
> we can set zfs limits as a workaround, yes - but zfs and oomkiller is 
> to blame !!!
>
> 1. zfs should free up memory faster, as memory is also freed from 
> buffers/caches
>
> 2. oomkiller should put pressure on arc or try reclaim pages from that 
> first, instead of killing kvm processes.  maybe oomkiller could be 
> made arc-aware!?
>
> roland
>
>
> >On 7/10/2025 11:08 AM, Roland via pve-user wrote:
> >if OOM kicks in because half of the ram is being used for 
> caches/buffers, i would blame OOMkiller or ZFS for tha. The problem 
> should be resolved at zfs or memory management level.
>
> >Absolutely no!
> >You are responsible for giving ZFS the limits. As even described in 
> the proxmox documentation here:
> >https://pve.proxmox.com/wiki/ZFS_on_Linux#sysadmin_zfs_limit_memory_usage 
>
>
>> _______________________________________________
>> pve-user mailing list
>> pve-user@lists.proxmox.com
>> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-user
>
>
>
> _______________________________________________
> pve-user mailing list
> pve-user@lists.proxmox.com
> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-user
>
>
> Am 10.07.25 um 16:49 schrieb dorsy via pve-user:
>> On 7/10/2025 11:08 AM, Roland via pve-user wrote:
>> if OOM kicks in because half of the ram is being used for 
>> caches/buffers, i would blame OOMkiller or ZFS for tha. The problem 
>> should be resolved at zfs or memory management level.
>>
>> Absolutely no!
>> You are responsible for giving ZFS the limits. As even described in 
>> the proxmox documentation here:
>> https://pve.proxmox.com/wiki/ZFS_on_Linux#sysadmin_zfs_limit_memory_usage 
>>
>>
>>> _______________________________________________
>>> pve-user mailing list
>>> pve-user@lists.proxmox.com
>>> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-user
>>
>>
>>
>> _______________________________________________
>> pve-user mailing list
>> pve-user@lists.proxmox.com
>> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-user

-- 
Best regards,
Dorotovics László
system administrator
IKRON Fejlesztő és Szolgáltató Kft.
Registered office: 6721 Szeged, Szilágyi utca 5-1.

[-- Attachment #2: Type: text/plain, Size: 157 bytes --]

_______________________________________________
pve-user mailing list
pve-user@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-user

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PVE-User] A less aggressive OOM?
  2025-07-10 14:49         ` dorsy via pve-user
  2025-07-10 16:11           ` Roland via pve-user
       [not found]           ` <98ace9cf-a47f-40cd-8796-6bec3558ebb0@web.de>
@ 2025-07-13 14:28           ` Marco Gaiarin
  2 siblings, 0 replies; 10+ messages in thread
From: Marco Gaiarin @ 2025-07-13 14:28 UTC (permalink / raw)
  To: dorsy via pve-user; +Cc: pve-user

Mandi! dorsy via pve-user
  In chel di` si favelave...

Thanks to all, particularly to Victor for the wonderful analysis, which led
me to understand the OOM dump a bit better...

>> if OOM kicks in because half of the ram is being used for 
>> caches/buffers, i would blame OOMkiller or ZFS for tha. The problem 
>> should be resolved at zfs or memory management level.

> Absolutely no!
> You are responsible for giving ZFS the limits. As even described in the 
> proxmox documentation here:
> https://pve.proxmox.com/wiki/ZFS_on_Linux#sysadmin_zfs_limit_memory_usage

I'm a bit on Roland's side on this. ARC is an (admittedly complex)
buffer/cache, so it seems reasonable that, if I need to sacrifice something,
it is better to sacrifice cache than a VM.

Anyway, if I understood correctly, the ZFS default was to let the ARC grow to
50% of RAM; since PVE 8.1, PVE changes that default to 10% (for new
installations); there's also a 'rule of thumb' for sizing the ARC, so 10% is
somewhat of a 'starting point'.
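
(If I recall the PVE reference documentation correctly, that rule of thumb is
roughly 2 GiB base + 1 GiB per TiB of ZFS pool capacity; e.g. a node with 4 TiB
of pool would want about 2 + 4 = 6 GiB of ARC, which is in the same ballpark as
the 10% default on a 64GB machine.)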


On some servers I can easily set up swap (I have a disk used for L2ARC, so I
can simply detach it, repartition it a bit, and reattach it as both L2ARC and
swap).
Clearly, I'll set swappiness to 1, so it is used only when strictly needed.
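
A minimal sketch of that plan (the device name is hypothetical; in practice a
stable /dev/disk/by-id path is safer):

  mkswap /dev/sdX2 && swapon /dev/sdX2             # enable the new swap partition
  echo '/dev/sdX2 none swap sw 0 0' >> /etc/fstab  # make it persistent
  sysctl vm.swappiness=1                           # and keep swapping to a minimum
  echo 'vm.swappiness = 1' > /etc/sysctl.d/99-swappiness.conf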


Thanks to all!

-- 



_______________________________________________
pve-user mailing list
pve-user@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-user


^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2025-07-13 14:39 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2025-07-07  9:26 [PVE-User] A less aggressive OOM? Marco Gaiarin
2025-07-07 21:39 ` Victor Rodriguez
2025-07-08 16:31   ` Marco Gaiarin
2025-07-10  8:56     ` Victor Rodriguez
2025-07-10  9:08       ` Roland via pve-user
2025-07-10 14:49         ` dorsy via pve-user
2025-07-10 16:11           ` Roland via pve-user
     [not found]           ` <98ace9cf-a47f-40cd-8796-6bec3558ebb0@web.de>
2025-07-10 16:15             ` dorsy via pve-user
2025-07-13 14:28           ` Marco Gaiarin
2025-07-08 12:05 ` Roland via pve-user

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox
Service provided by Proxmox Server Solutions GmbH | Privacy | Legal