From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from firstgate.proxmox.com (firstgate.proxmox.com [IPv6:2a01:7e0:0:424::9]) by lore.proxmox.com (Postfix) with ESMTPS id 0D2141FF39D for ; Tue, 11 Jun 2024 16:08:21 +0200 (CEST) Received: from firstgate.proxmox.com (localhost [127.0.0.1]) by firstgate.proxmox.com (Proxmox) with ESMTP id 0A62F318; Tue, 11 Jun 2024 16:08:55 +0200 (CEST) Message-ID: <4be51ab6-c7c2-4d69-8494-af9555cf7c49@proxmox.com> Date: Tue, 11 Jun 2024 16:08:51 +0200 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird To: Proxmox VE development discussion , Friedrich Weber References: <20240409124825.1809359-1-f.weber@proxmox.com> Content-Language: en-US From: Fiona Ebner In-Reply-To: <20240409124825.1809359-1-f.weber@proxmox.com> X-SPAM-LEVEL: Spam detection results: 0 AWL -0.059 Adjusted score from AWL reputation of From: address BAYES_00 -1.9 Bayes spam probability is 0 to 1% DMARC_MISSING 0.1 Missing DMARC policy KAM_DMARC_STATUS 0.01 Test Rule for DKIM or SPF Failure with Strict Alignment SPF_HELO_NONE 0.001 SPF: HELO does not publish an SPF Record SPF_PASS -0.001 SPF: sender matches SPF record T_SCC_BODY_TEXT_LINE -0.01 - Subject: Re: [pve-devel] [RFC pve-kernel] split lock detection: warn more prominently about misery mode X-BeenThere: pve-devel@lists.proxmox.com X-Mailman-Version: 2.1.29 Precedence: list List-Id: Proxmox VE development discussion List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: Proxmox VE development discussion Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Errors-To: pve-devel-bounces@lists.proxmox.com Sender: "pve-devel" Am 09.04.24 um 14:48 schrieb Friedrich Weber: > On processors with the `split_lock_detect` feature, the kernel will > detect user-space split locks and, by default, apply a "misery mode" > to offending tasks [1]. When a split lock is detected, the misery mode > artificially slows down the task by forcing a 10ms and serialization > with other offending tasks. This can also apply to vCPU threads of VM > guests that perform unaligned memory access. In this case, the misery > mode can result in temporary vCPU freezes of 10ms or more. > > Currently, correlating observed vCPU freezes with split lock detection > is hard for users: Split lock detection does log a warning, but the > warning does not mention the slowdown, and is logged only once for > each task, which makes troubleshooting hard in case of long-running > VMs. Currently, vCPU threads of OVMF VMs seem to already produce one > such warning on boot, which masks any split locks taken later by the > guest OS. > > To ease troubleshooting, add a patch that warns more prominently about > misery mode. With this patch, the kernel warns every time a task is > artificially slowed down. Even though the warnings are rate-limited by > the `printk_ratelimit` mechanism, there is a risk of large volume of > warnings, but this seems preferable to the misery mode going > unnoticed. > > [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/arch/x86/kernel/cpu/intel.c?id=727209376f49 > > Signed-off-by: Friedrich Weber > --- > > Notes: > With this patch applied, I see a risk that some users will get a *lot* > of warnings about misery mode. By default, these are rate-limited to > 10 messages per 5 seconds, but depending on the workload, this can > still mean a lot of messages. This is good for driving user > attention towards the split locks, but still, I wonder whether this > might spam the journal too much. What do you think? > I know this is an upper limit, but 10 messages per 5 seconds is too much IMHO. In particular, because certain guest OSes cannot be changed to avoid taking split locks and users will get scared by these messages. Unfortunately, reported_split_lock is just a flag and cannot be used as a counter to log every N taken split locks, which for the right N might be closer to what we/users actually want. Turning the flag into a counter would mean wasting more space in the "current" struct, not sure if that's worth it. > Also, currently there is no option to stop the warnings without > disabling misery mode (or split lock detection in general). Should > there be such an option? I could imagine making the > `split_lock_mitigate` sysctl ternary: > > 2: Enable misery mode and warn every time (new behavior, new default) > 1: Enable misery mode but warn only once (old behavior) > 0: Disable misery mode and warn only once > > ... or even completely decouple misery mode and warning behavior? > What do you think? > If we go for the current patch, I'm in favor of having a switch to turn the frequent warnings off. I feel like it should be a separate switch then. > ...lways-warn-when-applying-misery-mode.patch | 67 +++++++++++++++++++ > 1 file changed, 67 insertions(+) > create mode 100644 patches/kernel/0012-x86-split_lock-always-warn-when-applying-misery-mode.patch > > diff --git a/patches/kernel/0012-x86-split_lock-always-warn-when-applying-misery-mode.patch b/patches/kernel/0012-x86-split_lock-always-warn-when-applying-misery-mode.patch > new file mode 100644 > index 0000000..658de5e > --- /dev/null > +++ b/patches/kernel/0012-x86-split_lock-always-warn-when-applying-misery-mode.patch > @@ -0,0 +1,67 @@ > +From f8db8223735c529a1dcaaf41217024ee4091a464 Mon Sep 17 00:00:00 2001 > +From: Friedrich Weber > +Date: Tue, 9 Apr 2024 10:57:43 +0200 > +Subject: [PATCH] x86/split_lock: always warn when applying misery to task > + > +Since commits b041b525dab9 ("x86/split_lock: Make life miserable for > +split lockers") and 727209376f49 ("x86/split_lock: Add sysctl to > +control the misery mode"), split lock detection artificially slows > +down the offending task ("misery mode") if the `split_lock_mitigate` > +sysctl is enabled (which it is by default). The corresponding warning > +does not mention the slowdown, and is logged only once for each task. > +In case of long-running VMs, this makes it hard to correlate observed > +vCPU freezes to the split lock detection misery mode. > + A start would be to extend the warning to mention that it won't be logged again and there can be slowdowns. Maybe you can also ask upstream for feedback and suggestions. > +To make troubleshooting easier, change the warning behavior: > + > +- If the `split_lock_mitigate` sysctl is enabled, warn every time > + misery is applied to a task, and rephrase the warning. Even though > + the warnings are rate-limited by the `printk_ratelimit` mechanism, > + this may result in a large volume of warnings for split-lock-heavy > + workloads, but this seems preferable to the misery mode going > + unnoticed. > +- If the `split_lock_mitigate` sysctl is disabled, warn only once (as > + before), but mention explicitly that no further action is taken and > + subsequent traps will not be logged. > + > +Signed-off-by: Friedrich Weber As for the patch itself: Reviewed-by: Fiona Ebner _______________________________________________ pve-devel mailing list pve-devel@lists.proxmox.com https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel