From mboxrd@z Thu Jan 1 00:00:00 1970
From: Fabian Ebner
To: pve-devel@lists.proxmox.com
Date: Fri, 10 Dec 2021 10:24:13 +0100
Message-Id: <20211210092413.4654-1-f.ebner@proxmox.com>
X-Mailer: git-send-email 2.30.2
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Subject: [pve-devel] [PATCH pve-kernel] cherry-pick/backport amd{gpu, _sfh} fixes from ubuntu-jammy
List-Id: Proxmox VE development discussion

Some users reported boot failures after updating to the latest 5.13
kernel[0] because of a crash in amdgpu.

The patch
    drm/amdgpu: fix uvd crash on Polaris12 during driver unloading
fixes d82e2c249c8f ("drm/amdgpu: Fix crash on device remove/driver
unload"), which is present as backport 838dfb5888ff in the impish tree.
As it supplements the original fix and addresses a crash with a
backtrace similar to the ones in the forum thread[0], it seems to be the
most promising candidate.

The patch
    drm/amd/pm: avoid duplicate powergate/ungate setting
is related, as it fixes bf756fb833cb ("drm/amdgpu: add missing cleanups
for Polaris12 UVD/VCE on suspend"), which is the same commit that was
fixed by 838dfb5888ff, and it has a Cc for stable. A very slight
adaptation of the surrounding code was necessary for the patch to apply.

The patch
    drm/amdgpu: fix set scaling mode Full/Full aspect/Center not works on vga and dvi connectors
is likely not related, but it seems simple enough, has a Cc for stable
and applied cleanly.

The patch (with the same title as the one it fixes)
    HID: amd_sfh: Fix potential NULL pointer dereference
fixes d46ef750ed58 ("HID: amd_sfh: Fix potential NULL pointer
dereference"), which is present as backport 56559d7910e7 in the impish
tree and seems like the most likely culprit for a different issue
reported in the same forum thread[1]. A very slight adaptation of the
surrounding code was necessary for the patch to apply.

[0]: https://forum.proxmox.com/threads/100825/
[1]: https://forum.proxmox.com/threads/100825/post-435329

Signed-off-by: Fabian Ebner
---
 ...x-potential-NULL-pointer-dereference.patch |  52 ++++++++
 ...vd-crash-on-Polaris12-during-driver-.patch |  71 +++++++++++
 ...et-scaling-mode-Full-Full-aspect-Cen.patch |  45 +++++++
 ...d-duplicate-powergate-ungate-setting.patch | 119 ++++++++++++++++++
 4 files changed, 287 insertions(+)
 create mode 100644 patches/kernel/0011-HID-amd_sfh-Fix-potential-NULL-pointer-dereference.patch
 create mode 100644 patches/kernel/0012-drm-amdgpu-fix-uvd-crash-on-Polaris12-during-driver-.patch
 create mode 100644 patches/kernel/0013-drm-amdgpu-fix-set-scaling-mode-Full-Full-aspect-Cen.patch
 create mode 100644 patches/kernel/0014-drm-amd-pm-avoid-duplicate-powergate-ungate-setting.patch

diff --git a/patches/kernel/0011-HID-amd_sfh-Fix-potential-NULL-pointer-dereference.patch b/patches/kernel/0011-HID-amd_sfh-Fix-potential-NULL-pointer-dereference.patch
new file mode 100644
index 0000000..993328e
--- /dev/null
+++ b/patches/kernel/0011-HID-amd_sfh-Fix-potential-NULL-pointer-dereference.patch
@@ -0,0 +1,52 @@
+From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
+From: Basavaraj Natikar
+Date: Thu, 23 Sep 2021 17:59:27 +0530
+Subject: [PATCH] HID: amd_sfh: Fix potential NULL pointer dereference
+
+The cl_data field of a privdata must be allocated and updated before
+using in amd_sfh_hid_client_init() function.
+
+Hence handling NULL pointer cl_data accordingly.
+
+Fixes: d46ef750ed58 ("HID: amd_sfh: Fix potential NULL pointer dereference")
+Signed-off-by: Basavaraj Natikar
+Signed-off-by: Jiri Kosina
+[trivial backport]
+Signed-off-by: Fabian Ebner
+---
+ drivers/hid/amd-sfh-hid/amd_sfh_pcie.c | 12 ++++--------
+ 1 file changed, 4 insertions(+), 8 deletions(-)
+
+diff --git a/drivers/hid/amd-sfh-hid/amd_sfh_pcie.c b/drivers/hid/amd-sfh-hid/amd_sfh_pcie.c
+index 9a1824757aae..05c007b213f2 100644
+--- a/drivers/hid/amd-sfh-hid/amd_sfh_pcie.c
++++ b/drivers/hid/amd-sfh-hid/amd_sfh_pcie.c
+@@ -235,21 +235,17 @@ static int amd_mp2_pci_probe(struct pci_dev *pdev, const struct pci_device_id *i
+ 		return rc;
+ 	}
+ 
+-	rc = amd_sfh_hid_client_init(privdata);
+-	if (rc)
+-		return rc;
+-
+ 	privdata->cl_data = devm_kzalloc(&pdev->dev, sizeof(struct amdtp_cl_data), GFP_KERNEL);
+ 	if (!privdata->cl_data)
+ 		return -ENOMEM;
+ 
+-	rc = devm_add_action_or_reset(&pdev->dev, amd_mp2_pci_remove, privdata);
++	mp2_select_ops(privdata);
++
++	rc = amd_sfh_hid_client_init(privdata);
+ 	if (rc)
+ 		return rc;
+ 
+-	mp2_select_ops(privdata);
+-
+-	return 0;
++	return devm_add_action_or_reset(&pdev->dev, amd_mp2_pci_remove, privdata);
+ }
+ 
+ static const struct pci_device_id amd_mp2_pci_tbl[] = {
+-- 
+2.30.2
+
diff --git a/patches/kernel/0012-drm-amdgpu-fix-uvd-crash-on-Polaris12-during-driver-.patch b/patches/kernel/0012-drm-amdgpu-fix-uvd-crash-on-Polaris12-during-driver-.patch
new file mode 100644
index 0000000..59a4f57
--- /dev/null
+++ b/patches/kernel/0012-drm-amdgpu-fix-uvd-crash-on-Polaris12-during-driver-.patch
@@ -0,0 +1,71 @@
+From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
+From: Evan Quan
+Date: Sat, 9 Oct 2021 17:35:36 +0800
+Subject: [PATCH] drm/amdgpu: fix uvd crash on Polaris12 during driver
+ unloading
+
+BugLink: https://bugs.launchpad.net/bugs/1951822
+
+[ Upstream commit 4fc30ea780e0a5c1c019bc2e44f8523e1eed9051 ]
+
+There was a change(below) target for such issue:
+d82e2c249c8f ("drm/amdgpu: Fix crash on device remove/driver unload")
+But the fix for VI ASICs was missing there. This is a supplement for
+that.
+
+Fixes: d82e2c249c8f ("drm/amdgpu: Fix crash on device remove/driver unload")
+
+Signed-off-by: Evan Quan
+Acked-by: Alex Deucher
+Signed-off-by: Alex Deucher
+Signed-off-by: Sasha Levin
+Signed-off-by: Paolo Pisati
+---
+ drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c | 24 +++++++++++++-----------
+ 1 file changed, 13 insertions(+), 11 deletions(-)
+
+diff --git a/drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c b/drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c
+index bc571833632e..72f876290768 100644
+--- a/drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c
++++ b/drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c
+@@ -543,6 +543,19 @@ static int uvd_v6_0_hw_fini(void *handle)
+ {
+ 	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
+ 
++	cancel_delayed_work_sync(&adev->uvd.idle_work);
++
++	if (RREG32(mmUVD_STATUS) != 0)
++		uvd_v6_0_stop(adev);
++
++	return 0;
++}
++
++static int uvd_v6_0_suspend(void *handle)
++{
++	int r;
++	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
++
+ 	/*
+ 	 * Proper cleanups before halting the HW engine:
+ 	 *   - cancel the delayed idle work
+@@ -567,17 +580,6 @@ static int uvd_v6_0_hw_fini(void *handle)
+ 						       AMD_CG_STATE_GATE);
+ 	}
+ 
+-	if (RREG32(mmUVD_STATUS) != 0)
+-		uvd_v6_0_stop(adev);
+-
+-	return 0;
+-}
+-
+-static int uvd_v6_0_suspend(void *handle)
+-{
+-	int r;
+-	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
+-
+ 	r = uvd_v6_0_hw_fini(adev);
+ 	if (r)
+ 		return r;
+-- 
+2.30.2
+
diff --git a/patches/kernel/0013-drm-amdgpu-fix-set-scaling-mode-Full-Full-aspect-Cen.patch b/patches/kernel/0013-drm-amdgpu-fix-set-scaling-mode-Full-Full-aspect-Cen.patch
new file mode 100644
index 0000000..b904bbd
--- /dev/null
+++ b/patches/kernel/0013-drm-amdgpu-fix-set-scaling-mode-Full-Full-aspect-Cen.patch
@@ -0,0 +1,45 @@
+From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
+From: hongao
+Date: Thu, 11 Nov 2021 11:32:07 +0800
+Subject: [PATCH] drm/amdgpu: fix set scaling mode Full/Full aspect/Center not
+ works on vga and dvi connectors
+
+BugLink: https://bugs.launchpad.net/bugs/1952579
+
+commit bf552083916a7f8800477b5986940d1c9a31b953 upstream.
+
+amdgpu_connector_vga_get_modes missed function amdgpu_get_native_mode
+which assign amdgpu_encoder->native_mode with *preferred_mode result in
+amdgpu_encoder->native_mode.clock always be 0. That will cause
+amdgpu_connector_set_property returned early on:
+if ((rmx_type != DRM_MODE_SCALE_NONE) &&
+	(amdgpu_encoder->native_mode.clock == 0))
+when we try to set scaling mode Full/Full aspect/Center.
+Add the missing function to amdgpu_connector_vga_get_mode can fix this.
+It also works on dvi connectors because
+amdgpu_connector_dvi_helper_funcs.get_mode use the same method.
+
+Signed-off-by: hongao
+Signed-off-by: Alex Deucher
+Cc: stable@vger.kernel.org
+Signed-off-by: Greg Kroah-Hartman
+Signed-off-by: Andrea Righi
+---
+ drivers/gpu/drm/amd/amdgpu/amdgpu_connectors.c | 1 +
+ 1 file changed, 1 insertion(+)
+
+diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_connectors.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_connectors.c
+index b9c11c2b2885..0de66f59adb8 100644
+--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_connectors.c
++++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_connectors.c
+@@ -827,6 +827,7 @@ static int amdgpu_connector_vga_get_modes(struct drm_connector *connector)
+ 
+ 	amdgpu_connector_get_edid(connector);
+ 	ret = amdgpu_connector_ddc_get_modes(connector);
++	amdgpu_get_native_mode(connector);
+ 
+ 	return ret;
+ }
+-- 
+2.30.2
+
diff --git a/patches/kernel/0014-drm-amd-pm-avoid-duplicate-powergate-ungate-setting.patch b/patches/kernel/0014-drm-amd-pm-avoid-duplicate-powergate-ungate-setting.patch
new file mode 100644
index 0000000..8e638ae
--- /dev/null
+++ b/patches/kernel/0014-drm-amd-pm-avoid-duplicate-powergate-ungate-setting.patch
@@ -0,0 +1,119 @@
+From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
+From: Evan Quan
+Date: Fri, 5 Nov 2021 15:25:30 +0800
+Subject: [PATCH] drm/amd/pm: avoid duplicate powergate/ungate setting
+
+BugLink: https://bugs.launchpad.net/bugs/1952579
+
+commit 6ee27ee27ba8b2e725886951ba2d2d87f113bece upstream.
+
+Just bail out if the target IP block is already in the desired
+powergate/ungate state. This can avoid some duplicate settings
+which sometimes may cause unexpected issues.
+
+Link: https://lore.kernel.org/all/YV81vidWQLWvATMM@zn.tnic/
+Bug: https://bugzilla.kernel.org/show_bug.cgi?id=214921
+Bug: https://bugzilla.kernel.org/show_bug.cgi?id=215025
+Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1789
+Fixes: bf756fb833cb ("drm/amdgpu: add missing cleanups for Polaris12 UVD/VCE on suspend")
+Signed-off-by: Evan Quan
+Tested-by: Borislav Petkov
+Reviewed-by: Lijo Lazar
+Signed-off-by: Alex Deucher
+Cc: stable@vger.kernel.org
+Signed-off-by: Greg Kroah-Hartman
+Signed-off-by: Andrea Righi
+[trivial backport]
+Signed-off-by: Fabian Ebner
+---
+ drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  3 +++
+ drivers/gpu/drm/amd/include/amd_shared.h   |  3 ++-
+ drivers/gpu/drm/amd/pm/amdgpu_dpm.c        | 10 ++++++++++
+ drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h    |  8 ++++++++
+ 4 files changed, 23 insertions(+), 1 deletion(-)
+
+diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+index c1e34aa5925b..96ca42bcfdbf 100644
+--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
++++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+@@ -3387,6 +3387,9 @@ int amdgpu_device_init(struct amdgpu_device *adev,
+ 		adev->rmmio_size = pci_resource_len(adev->pdev, 2);
+ 	}
+ 
++	for (i = 0; i < AMD_IP_BLOCK_TYPE_NUM; i++)
++		atomic_set(&adev->pm.pwr_state[i], POWER_STATE_UNKNOWN);
++
+ 	adev->rmmio = ioremap(adev->rmmio_base, adev->rmmio_size);
+ 	if (adev->rmmio == NULL) {
+ 		return -ENOMEM;
+diff --git a/drivers/gpu/drm/amd/include/amd_shared.h b/drivers/gpu/drm/amd/include/amd_shared.h
+index 257f280d3d53..bd077ea224a4 100644
+--- a/drivers/gpu/drm/amd/include/amd_shared.h
++++ b/drivers/gpu/drm/amd/include/amd_shared.h
+@@ -97,7 +97,8 @@ enum amd_ip_block_type {
+ 	AMD_IP_BLOCK_TYPE_ACP,
+ 	AMD_IP_BLOCK_TYPE_VCN,
+ 	AMD_IP_BLOCK_TYPE_MES,
+-	AMD_IP_BLOCK_TYPE_JPEG
++	AMD_IP_BLOCK_TYPE_JPEG,
++	AMD_IP_BLOCK_TYPE_NUM,
+ };
+ 
+ enum amd_clockgating_state {
+diff --git a/drivers/gpu/drm/amd/pm/amdgpu_dpm.c b/drivers/gpu/drm/amd/pm/amdgpu_dpm.c
+index 03581d5b1836..08362d506534 100644
+--- a/drivers/gpu/drm/amd/pm/amdgpu_dpm.c
++++ b/drivers/gpu/drm/amd/pm/amdgpu_dpm.c
+@@ -927,6 +927,13 @@ int amdgpu_dpm_set_powergating_by_smu(struct amdgpu_device *adev, uint32_t block
+ {
+ 	int ret = 0;
+ 	const struct amd_pm_funcs *pp_funcs = adev->powerplay.pp_funcs;
++	enum ip_power_state pwr_state = gate ? POWER_STATE_OFF : POWER_STATE_ON;
++
++	if (atomic_read(&adev->pm.pwr_state[block_type]) == pwr_state) {
++		dev_dbg(adev->dev, "IP block%d already in the target %s state!",
++				block_type, gate ? "gate" : "ungate");
++		return 0;
++	}
+ 
+ 	switch (block_type) {
+ 	case AMD_IP_BLOCK_TYPE_UVD:
+@@ -979,6 +986,9 @@ int amdgpu_dpm_set_powergating_by_smu(struct amdgpu_device *adev, uint32_t block
+ 		break;
+ 	}
+ 
++	if (!ret)
++		atomic_set(&adev->pm.pwr_state[block_type], pwr_state);
++
+ 	return ret;
+ }
+ 
+diff --git a/drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h b/drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h
+index 98f1b3d8c1d5..16e3f72d31b9 100644
+--- a/drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h
++++ b/drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h
+@@ -417,6 +417,12 @@ struct amdgpu_dpm {
+ 	enum amd_dpm_forced_level forced_level;
+ };
+ 
++enum ip_power_state {
++	POWER_STATE_UNKNOWN,
++	POWER_STATE_ON,
++	POWER_STATE_OFF,
++};
++
+ struct amdgpu_pm {
+ 	struct mutex mutex;
+ 	u32 current_sclk;
+@@ -451,6 +457,8 @@ struct amdgpu_pm {
+ 	/* Used for I2C access to various EEPROMs on relevant ASICs */
+ 	struct i2c_adapter smu_i2c;
+ 	struct list_head pm_attr_list;
++
++	atomic_t pwr_state[AMD_IP_BLOCK_TYPE_NUM];
+ };
+ 
+ #define R600_SSTU_DFLT 0
+-- 
+2.30.2
+
-- 
2.30.2
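
For readers who want to see the idea behind "drm/amd/pm: avoid duplicate
powergate/ungate setting" in isolation, here is a minimal userspace sketch.
It is not kernel code and not part of the patch: it uses C11 atomics in
place of the kernel's atomic_t, and every name in it (the *_sketch enums,
apply_gate_hw(), set_powergating_sketch(), demo_duplicate_requests()) is
hypothetical and exists only for illustration. It caches the last
successfully applied state per block, starts from "unknown", and returns
early when a request would not change anything.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the IP block and power state enums. */
enum ip_block_sketch { BLOCK_UVD_SKETCH, BLOCK_VCE_SKETCH, BLOCK_NUM_SKETCH };
enum power_state_sketch { STATE_UNKNOWN_SKETCH, STATE_ON_SKETCH, STATE_OFF_SKETCH };

/* Last state that was applied successfully, one slot per block. */
static atomic_int pwr_state_sketch[BLOCK_NUM_SKETCH];

/* Counts how often the fake "hardware" was actually touched. */
static int hw_calls;

/* Hypothetical hardware request; it always succeeds in this sketch. */
static int apply_gate_hw(int block, bool gate)
{
	(void)block;
	(void)gate;
	hw_calls++;
	return 0;
}

/* Start with every block in the unknown state, so the first request
 * for each block always reaches the hardware. */
static void pm_init_sketch(void)
{
	for (int i = 0; i < BLOCK_NUM_SKETCH; i++)
		atomic_store(&pwr_state_sketch[i], STATE_UNKNOWN_SKETCH);
	hw_calls = 0;
}

static int set_powergating_sketch(int block, bool gate)
{
	int target = gate ? STATE_OFF_SKETCH : STATE_ON_SKETCH;
	int ret;

	/* Bail out if the block is already in the requested state. */
	if (atomic_load(&pwr_state_sketch[block]) == target)
		return 0;

	ret = apply_gate_hw(block, gate);

	/* Only remember the new state if the request succeeded. */
	if (!ret)
		atomic_store(&pwr_state_sketch[block], target);

	return ret;
}

/* Gate twice, then ungate once; returns the number of hardware calls. */
static int demo_duplicate_requests(void)
{
	pm_init_sketch();
	set_powergating_sketch(BLOCK_UVD_SKETCH, true);
	set_powergating_sketch(BLOCK_UVD_SKETCH, true);	/* duplicate, skipped */
	set_powergating_sketch(BLOCK_UVD_SKETCH, false);
	return hw_calls;
}
```

Calling demo_duplicate_requests() from a main() returns 2: the second gate
request is dropped before it reaches the fake hardware. Updating the cache
only on success means a failed request is retried the next time instead of
being silently skipped.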