public inbox for pve-devel@lists.proxmox.com
 help / color / mirror / Atom feed
* [pve-devel] [PATCH pve-kernel] cherry-pick fixes for AMD Epyc genua systems
@ 2025-04-10 13:08 Stoiko Ivanov
  2025-04-10 18:51 ` [pve-devel] applied: " Thomas Lamprecht
  0 siblings, 1 reply; 2+ messages in thread
From: Stoiko Ivanov @ 2025-04-10 13:08 UTC (permalink / raw)
  To: pve-devel

both patches are queued for 6.14.2:
https://lore.kernel.org/all/20250409115934.968141886@linuxfoundation.org/
issue was reported in our community forum:
https://forum.proxmox.com/threads/.164497/post-762617

as we have access to a server where we could reproduce the issue
(crash+loop, before the system was up[0]) I tested with those 2
a kernel with those 2 patches applied - and the system booted
successfully.

FWIW: I tried building with the original series as well (containing a
removal of some PCI-ids), and it also resolved the issue:
https://lore.kernel.org/all/20250203162511.911946-1-Basavaraj.Natikar@amd.com/

[0] before proxmox-boot-cleanup.service (so pinning with --next-boot
did not help)

Signed-off-by: Stoiko Ivanov <s.ivanov@proxmox.com>
---
 ...-Use-the-MSI-count-and-its-correspon.patch |  31 +++
 ...Utilize-the-AE4DMA-engine-s-multi-qu.patch | 201 ++++++++++++++++++
 2 files changed, 232 insertions(+)
 create mode 100644 patches/kernel/0014-dmaengine-ae4dma-Use-the-MSI-count-and-its-correspon.patch
 create mode 100644 patches/kernel/0015-dmaengine-ptdma-Utilize-the-AE4DMA-engine-s-multi-qu.patch

diff --git a/patches/kernel/0014-dmaengine-ae4dma-Use-the-MSI-count-and-its-correspon.patch b/patches/kernel/0014-dmaengine-ae4dma-Use-the-MSI-count-and-its-correspon.patch
new file mode 100644
index 000000000000..a31676273a98
--- /dev/null
+++ b/patches/kernel/0014-dmaengine-ae4dma-Use-the-MSI-count-and-its-correspon.patch
@@ -0,0 +1,31 @@
+From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
+From: Basavaraj Natikar <Basavaraj.Natikar@amd.com>
+Date: Mon, 3 Feb 2025 21:55:10 +0530
+Subject: [PATCH] dmaengine: ae4dma: Use the MSI count and its corresponding
+ IRQ number
+
+Instead of using the defined maximum hardware queue, which can lead to
+incorrect values if the counts mismatch, use the exact supported MSI
+count and its corresponding IRQ number.
+
+Fixes: 90a30e268d9b ("dmaengine: ae4dma: Add AMD ae4dma controller driver")
+Signed-off-by: Basavaraj Natikar <Basavaraj.Natikar@amd.com>
+---
+ drivers/dma/amd/ae4dma/ae4dma-pci.c | 4 ++--
+ 1 file changed, 2 insertions(+), 2 deletions(-)
+
+diff --git a/drivers/dma/amd/ae4dma/ae4dma-pci.c b/drivers/dma/amd/ae4dma/ae4dma-pci.c
+index aad0dc4294a3945245737978c077eecf740ccb3a..587c5a10c1a8b2dbb925c31af86b1d0b23438b45 100644
+--- a/drivers/dma/amd/ae4dma/ae4dma-pci.c
++++ b/drivers/dma/amd/ae4dma/ae4dma-pci.c
+@@ -46,8 +46,8 @@ static int ae4_get_irqs(struct ae4_device *ae4)
+ 
+ 	} else {
+ 		ae4_msix->msix_count = ret;
+-		for (i = 0; i < MAX_AE4_HW_QUEUES; i++)
+-			ae4->ae4_irq[i] = ae4_msix->msix_entry[i].vector;
++		for (i = 0; i < ae4_msix->msix_count; i++)
++			ae4->ae4_irq[i] = pci_irq_vector(pdev, i);
+ 	}
+ 
+ 	return ret;
diff --git a/patches/kernel/0015-dmaengine-ptdma-Utilize-the-AE4DMA-engine-s-multi-qu.patch b/patches/kernel/0015-dmaengine-ptdma-Utilize-the-AE4DMA-engine-s-multi-qu.patch
new file mode 100644
index 000000000000..c59d69738779
--- /dev/null
+++ b/patches/kernel/0015-dmaengine-ptdma-Utilize-the-AE4DMA-engine-s-multi-qu.patch
@@ -0,0 +1,201 @@
+From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
+From: Basavaraj Natikar <Basavaraj.Natikar@amd.com>
+Date: Mon, 3 Feb 2025 21:55:11 +0530
+Subject: [PATCH] dmaengine: ptdma: Utilize the AE4DMA engine's multi-queue
+ functionality
+MIME-Version: 1.0
+Content-Type: text/plain; charset=UTF-8
+Content-Transfer-Encoding: 8bit
+
+As AE4DMA offers multi-channel functionality compared to PTDMA’s single
+queue, utilize multi-queue, which supports higher speeds than PTDMA, to
+achieve higher performance using the AE4DMA workqueue based mechanism.
+
+Fixes: 69a47b16a51b ("dmaengine: ptdma: Extend ptdma to support multi-channel and version")
+Signed-off-by: Basavaraj Natikar <Basavaraj.Natikar@amd.com>
+---
+ drivers/dma/amd/ae4dma/ae4dma.h         |  2 +
+ drivers/dma/amd/ptdma/ptdma-dmaengine.c | 90 ++++++++++++++++++++++++-
+ 2 files changed, 89 insertions(+), 3 deletions(-)
+
+diff --git a/drivers/dma/amd/ae4dma/ae4dma.h b/drivers/dma/amd/ae4dma/ae4dma.h
+index 265c5d4360080d6a0cc77f2bab507fde761d5461..57f6048726bb68da03e145d0c69f4bdcd4012c6f 100644
+--- a/drivers/dma/amd/ae4dma/ae4dma.h
++++ b/drivers/dma/amd/ae4dma/ae4dma.h
+@@ -37,6 +37,8 @@
+ #define AE4_DMA_VERSION			4
+ #define CMD_AE4_DESC_DW0_VAL		2
+ 
++#define AE4_TIME_OUT			5000
++
+ struct ae4_msix {
+ 	int msix_count;
+ 	struct msix_entry msix_entry[MAX_AE4_HW_QUEUES];
+diff --git a/drivers/dma/amd/ptdma/ptdma-dmaengine.c b/drivers/dma/amd/ptdma/ptdma-dmaengine.c
+index 35c84ec9608b4fd119972e3cd9abedf818dff743..715ac3ae067b857830db85e170787e30f3ae6b1d 100644
+--- a/drivers/dma/amd/ptdma/ptdma-dmaengine.c
++++ b/drivers/dma/amd/ptdma/ptdma-dmaengine.c
+@@ -198,8 +198,10 @@ static struct pt_dma_desc *pt_handle_active_desc(struct pt_dma_chan *chan,
+ {
+ 	struct dma_async_tx_descriptor *tx_desc;
+ 	struct virt_dma_desc *vd;
++	struct pt_device *pt;
+ 	unsigned long flags;
+ 
++	pt = chan->pt;
+ 	/* Loop over descriptors until one is found with commands */
+ 	do {
+ 		if (desc) {
+@@ -217,7 +219,7 @@ static struct pt_dma_desc *pt_handle_active_desc(struct pt_dma_chan *chan,
+ 
+ 		spin_lock_irqsave(&chan->vc.lock, flags);
+ 
+-		if (desc) {
++		if (pt->ver != AE4_DMA_VERSION && desc) {
+ 			if (desc->status != DMA_COMPLETE) {
+ 				if (desc->status != DMA_ERROR)
+ 					desc->status = DMA_COMPLETE;
+@@ -235,7 +237,7 @@ static struct pt_dma_desc *pt_handle_active_desc(struct pt_dma_chan *chan,
+ 
+ 		spin_unlock_irqrestore(&chan->vc.lock, flags);
+ 
+-		if (tx_desc) {
++		if (pt->ver != AE4_DMA_VERSION && tx_desc) {
+ 			dmaengine_desc_get_callback_invoke(tx_desc, NULL);
+ 			dma_run_dependencies(tx_desc);
+ 			vchan_vdesc_fini(vd);
+@@ -245,11 +247,25 @@ static struct pt_dma_desc *pt_handle_active_desc(struct pt_dma_chan *chan,
+ 	return NULL;
+ }
+ 
++static inline bool ae4_core_queue_full(struct pt_cmd_queue *cmd_q)
++{
++	u32 front_wi = readl(cmd_q->reg_control + AE4_WR_IDX_OFF);
++	u32 rear_ri = readl(cmd_q->reg_control + AE4_RD_IDX_OFF);
++
++	if (((MAX_CMD_QLEN + front_wi - rear_ri) % MAX_CMD_QLEN)  >= (MAX_CMD_QLEN - 1))
++		return true;
++
++	return false;
++}
++
+ static void pt_cmd_callback(void *data, int err)
+ {
+ 	struct pt_dma_desc *desc = data;
++	struct ae4_cmd_queue *ae4cmd_q;
+ 	struct dma_chan *dma_chan;
+ 	struct pt_dma_chan *chan;
++	struct ae4_device *ae4;
++	struct pt_device *pt;
+ 	int ret;
+ 
+ 	if (err == -EINPROGRESS)
+@@ -257,11 +273,32 @@ static void pt_cmd_callback(void *data, int err)
+ 
+ 	dma_chan = desc->vd.tx.chan;
+ 	chan = to_pt_chan(dma_chan);
++	pt = chan->pt;
+ 
+ 	if (err)
+ 		desc->status = DMA_ERROR;
+ 
+ 	while (true) {
++		if (pt->ver == AE4_DMA_VERSION) {
++			ae4 = container_of(pt, struct ae4_device, pt);
++			ae4cmd_q = &ae4->ae4cmd_q[chan->id];
++
++			if (ae4cmd_q->q_cmd_count >= (CMD_Q_LEN - 1) ||
++			    ae4_core_queue_full(&ae4cmd_q->cmd_q)) {
++				wake_up(&ae4cmd_q->q_w);
++
++				if (wait_for_completion_timeout(&ae4cmd_q->cmp,
++								msecs_to_jiffies(AE4_TIME_OUT))
++								== 0) {
++					dev_err(pt->dev, "TIMEOUT %d:\n", ae4cmd_q->id);
++					break;
++				}
++
++				reinit_completion(&ae4cmd_q->cmp);
++				continue;
++			}
++		}
++
+ 		/* Check for DMA descriptor completion */
+ 		desc = pt_handle_active_desc(chan, desc);
+ 
+@@ -296,6 +333,49 @@ static struct pt_dma_desc *pt_alloc_dma_desc(struct pt_dma_chan *chan,
+ 	return desc;
+ }
+ 
++static void pt_cmd_callback_work(void *data, int err)
++{
++	struct dma_async_tx_descriptor *tx_desc;
++	struct pt_dma_desc *desc = data;
++	struct dma_chan *dma_chan;
++	struct virt_dma_desc *vd;
++	struct pt_dma_chan *chan;
++	unsigned long flags;
++
++	dma_chan = desc->vd.tx.chan;
++	chan = to_pt_chan(dma_chan);
++
++	if (err == -EINPROGRESS)
++		return;
++
++	tx_desc = &desc->vd.tx;
++	vd = &desc->vd;
++
++	if (err)
++		desc->status = DMA_ERROR;
++
++	spin_lock_irqsave(&chan->vc.lock, flags);
++	if (desc) {
++		if (desc->status != DMA_COMPLETE) {
++			if (desc->status != DMA_ERROR)
++				desc->status = DMA_COMPLETE;
++
++			dma_cookie_complete(tx_desc);
++			dma_descriptor_unmap(tx_desc);
++		} else {
++			tx_desc = NULL;
++		}
++	}
++	spin_unlock_irqrestore(&chan->vc.lock, flags);
++
++	if (tx_desc) {
++		dmaengine_desc_get_callback_invoke(tx_desc, NULL);
++		dma_run_dependencies(tx_desc);
++		list_del(&desc->vd.node);
++		vchan_vdesc_fini(vd);
++	}
++}
++
+ static struct pt_dma_desc *pt_create_desc(struct dma_chan *dma_chan,
+ 					  dma_addr_t dst,
+ 					  dma_addr_t src,
+@@ -327,6 +407,7 @@ static struct pt_dma_desc *pt_create_desc(struct dma_chan *dma_chan,
+ 	desc->len = len;
+ 
+ 	if (pt->ver == AE4_DMA_VERSION) {
++		pt_cmd->pt_cmd_callback = pt_cmd_callback_work;
+ 		ae4 = container_of(pt, struct ae4_device, pt);
+ 		ae4cmd_q = &ae4->ae4cmd_q[chan->id];
+ 		mutex_lock(&ae4cmd_q->cmd_lock);
+@@ -367,13 +448,16 @@ static void pt_issue_pending(struct dma_chan *dma_chan)
+ {
+ 	struct pt_dma_chan *chan = to_pt_chan(dma_chan);
+ 	struct pt_dma_desc *desc;
++	struct pt_device *pt;
+ 	unsigned long flags;
+ 	bool engine_is_idle = true;
+ 
++	pt = chan->pt;
++
+ 	spin_lock_irqsave(&chan->vc.lock, flags);
+ 
+ 	desc = pt_next_dma_desc(chan);
+-	if (desc)
++	if (desc && pt->ver != AE4_DMA_VERSION)
+ 		engine_is_idle = false;
+ 
+ 	vchan_issue_pending(&chan->vc);
-- 
2.39.5



_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel

^ permalink raw reply	[flat|nested] 2+ messages in thread

* [pve-devel] applied: [PATCH pve-kernel] cherry-pick fixes for AMD Epyc genua systems
  2025-04-10 13:08 [pve-devel] [PATCH pve-kernel] cherry-pick fixes for AMD Epyc genua systems Stoiko Ivanov
@ 2025-04-10 18:51 ` Thomas Lamprecht
  0 siblings, 0 replies; 2+ messages in thread
From: Thomas Lamprecht @ 2025-04-10 18:51 UTC (permalink / raw)
  To: pve-devel, Stoiko Ivanov

On Thu, 10 Apr 2025 15:08:34 +0200, Stoiko Ivanov wrote:
> both patches are queued for 6.14.2:
> https://lore.kernel.org/all/20250409115934.968141886@linuxfoundation.org/
> issue was reported in our community forum:
> https://forum.proxmox.com/threads/.164497/post-762617
> 
> as we have access to a server where we could reproduce the issue
> (crash+loop, before the system was up[0]) I tested with those 2
> a kernel with those 2 patches applied - and the system booted
> successfully.
> 
> [...]

Applied, thanks!

[1/1] cherry-pick fixes for AMD Epyc genua systems
      commit: 4a6063d2f9565631ad4968517d2b11d3821c1bfe


_______________________________________________
pve-devel mailing list
pve-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel


^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2025-04-10 18:52 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2025-04-10 13:08 [pve-devel] [PATCH pve-kernel] cherry-pick fixes for AMD Epyc genua systems Stoiko Ivanov
2025-04-10 18:51 ` [pve-devel] applied: " Thomas Lamprecht

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox
Service provided by Proxmox Server Solutions GmbH | Privacy | Legal