public inbox for pve-devel@lists.proxmox.com
 help / color / mirror / Atom feed
* [pve-devel] [PATCH qemu] fix #5054: backport fix for software reset with SATA
@ 2023-11-20  9:16 Fiona Ebner
  2023-11-20  9:29 ` Friedrich Weber
  2023-11-20  9:42 ` [pve-devel] applied: " Thomas Lamprecht
  0 siblings, 2 replies; 3+ messages in thread
From: Fiona Ebner @ 2023-11-20  9:16 UTC (permalink / raw)
  To: pve-devel

The issue prevented FreeBSD 14 VMs with SATA disk from booting.

The commit it fixes e2a5d9b3d9c3 ("hw/ide/ahci: simplify and document
PxCI handling") is part of stable 8.1.2.

The patch was already applied to the block branch upstream:
https://lists.nongnu.org/archive/html/qemu-devel/2023-11/msg02711.html

Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
 ...w-ide-ahci-fix-legacy-software-reset.patch | 107 ++++++++++++++++++
 debian/patches/series                         |   1 +
 2 files changed, 108 insertions(+)
 create mode 100644 debian/patches/extra/0009-hw-ide-ahci-fix-legacy-software-reset.patch

diff --git a/debian/patches/extra/0009-hw-ide-ahci-fix-legacy-software-reset.patch b/debian/patches/extra/0009-hw-ide-ahci-fix-legacy-software-reset.patch
new file mode 100644
index 0000000..f070818
--- /dev/null
+++ b/debian/patches/extra/0009-hw-ide-ahci-fix-legacy-software-reset.patch
@@ -0,0 +1,107 @@
+From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
+From: Niklas Cassel <niklas.cassel@wdc.com>
+Date: Wed, 8 Nov 2023 23:26:57 +0100
+Subject: [PATCH] hw/ide/ahci: fix legacy software reset
+
+Legacy software contains a standard mechanism for generating a reset to a
+Serial ATA device - setting the SRST (software reset) bit in the Device
+Control register.
+
+Serial ATA has a more robust mechanism called COMRESET, also referred to
+as port reset. A port reset is the preferred mechanism for error
+recovery and should be used in place of software reset.
+
+Commit e2a5d9b3d9c3 ("hw/ide/ahci: simplify and document PxCI handling")
+improved the handling of PxCI, such that PxCI gets cleared after handling
+a non-NCQ, or NCQ command (instead of incorrectly clearing PxCI after
+receiving anything - even a FIS that failed to parse, which should NOT
+clear PxCI, so that you can see which command slot that caused an error).
+
+However, simply clearing PxCI after a non-NCQ, or NCQ command, is not
+enough, we also need to clear PxCI when receiving a SRST in the Device
+Control register.
+
+A legacy software reset is performed by the host sending two H2D FISes,
+the first H2D FIS asserts SRST, and the second H2D FIS deasserts SRST.
+
+The first H2D FIS will not get a D2H reply, and requires the FIS to have
+the C bit set to one, such that the HBA itself will clear the bit in PxCI.
+
+The second H2D FIS will get a D2H reply once the diagnostic is completed.
+The clearing of the bit in PxCI for this command should ideally be done
+in ahci_init_d2h() (if it was a legacy software reset that caused the
+reset (a COMRESET does not use a command slot)). However, since the reset
+value for PxCI is 0, modify ahci_reset_port() to actually clear PxCI to 0,
+that way we can avoid complex logic in ahci_init_d2h().
+
+This fixes an issue for FreeBSD where the device would fail to reset.
+The problem was not noticed in Linux, because Linux uses a COMRESET
+instead of a legacy software reset by default.
+
+Fixes: e2a5d9b3d9c3 ("hw/ide/ahci: simplify and document PxCI handling")
+Reported-by: Marcin Juszkiewicz <marcin.juszkiewicz@linaro.org>
+Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
+(picked from https://lists.nongnu.org/archive/html/qemu-devel/2023-11/msg02277.html)
+Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
+---
+ hw/ide/ahci.c | 27 ++++++++++++++++++++++++++-
+ 1 file changed, 26 insertions(+), 1 deletion(-)
+
+diff --git a/hw/ide/ahci.c b/hw/ide/ahci.c
+index d0a774bc17..1718b7e902 100644
+--- a/hw/ide/ahci.c
++++ b/hw/ide/ahci.c
+@@ -623,9 +623,13 @@ static void ahci_init_d2h(AHCIDevice *ad)
+         return;
+     }
+ 
++    /*
++     * For simplicity, do not call ahci_clear_cmd_issue() for this
++     * ahci_write_fis_d2h(). (The reset value for PxCI is 0.)
++     */
+     if (ahci_write_fis_d2h(ad, true)) {
+         ad->init_d2h_sent = true;
+-        /* We're emulating receiving the first Reg H2D Fis from the device;
++        /* We're emulating receiving the first Reg D2H FIS from the device;
+          * Update the SIG register, but otherwise proceed as normal. */
+         pr->sig = ((uint32_t)ide_state->hcyl << 24) |
+             (ide_state->lcyl << 16) |
+@@ -663,6 +667,7 @@ static void ahci_reset_port(AHCIState *s, int port)
+     pr->scr_act = 0;
+     pr->tfdata = 0x7F;
+     pr->sig = 0xFFFFFFFF;
++    pr->cmd_issue = 0;
+     d->busy_slot = -1;
+     d->init_d2h_sent = false;
+ 
+@@ -1243,10 +1248,30 @@ static void handle_reg_h2d_fis(AHCIState *s, int port,
+         case STATE_RUN:
+             if (cmd_fis[15] & ATA_SRST) {
+                 s->dev[port].port_state = STATE_RESET;
++                /*
++                 * When setting SRST in the first H2D FIS in the reset sequence,
++                 * the device does not send a D2H FIS. Host software thus has to
++                 * set the "Clear Busy upon R_OK" bit such that PxCI (and BUSY)
++                 * gets cleared. See AHCI 1.3.1, section 10.4.1 Software Reset.
++                 */
++                if (opts & AHCI_CMD_CLR_BUSY) {
++                    ahci_clear_cmd_issue(ad, slot);
++                }
+             }
+             break;
+         case STATE_RESET:
+             if (!(cmd_fis[15] & ATA_SRST)) {
++                /*
++                 * When clearing SRST in the second H2D FIS in the reset
++                 * sequence, the device will execute diagnostics. When this is
++                 * done, the device will send a D2H FIS with the good status.
++                 * See SATA 3.5a Gold, section 11.4 Software reset protocol.
++                 *
++                 * This D2H FIS is the first D2H FIS received from the device,
++                 * and is received regardless if the reset was performed by a
++                 * COMRESET or by setting and clearing the SRST bit. Therefore,
++                 * the logic for this is found in ahci_init_d2h() and not here.
++                 */
+                 ahci_reset_port(s, port);
+             }
+             break;
diff --git a/debian/patches/series b/debian/patches/series
index ad84088..992299c 100644
--- a/debian/patches/series
+++ b/debian/patches/series
@@ -6,6 +6,7 @@ extra/0005-hw-ide-reset-cancel-async-DMA-operation-before-reset.patch
 extra/0006-Revert-Revert-graph-lock-Disable-locking-for-now.patch
 extra/0007-migration-states-workaround-snapshot-performance-reg.patch
 extra/0008-Revert-x86-acpi-workaround-Windows-not-handling-name.patch
+extra/0009-hw-ide-ahci-fix-legacy-software-reset.patch
 bitmap-mirror/0001-drive-mirror-add-support-for-sync-bitmap-mode-never.patch
 bitmap-mirror/0002-drive-mirror-add-support-for-conditional-and-always-.patch
 bitmap-mirror/0003-mirror-add-check-for-bitmap-mode-without-bitmap.patch
-- 
2.39.2





^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [pve-devel] [PATCH qemu] fix #5054: backport fix for software reset with SATA
  2023-11-20  9:16 [pve-devel] [PATCH qemu] fix #5054: backport fix for software reset with SATA Fiona Ebner
@ 2023-11-20  9:29 ` Friedrich Weber
  2023-11-20  9:42 ` [pve-devel] applied: " Thomas Lamprecht
  1 sibling, 0 replies; 3+ messages in thread
From: Friedrich Weber @ 2023-11-20  9:29 UTC (permalink / raw)
  To: Proxmox VE development discussion, Fiona Ebner

Tested with an OPNsense VM: With pve-qemu-kvm 8.1.2-2, it did not boot
from SATA ("Root mount waiting for: CAM"). virtio worked though.

With the patched pve-qemu-kvm package I got from Fiona, the VM booted
from SATA again. virtio still works too.

Tested-by: Friedrich Weber <f.weber@proxmox.com>

On 20/11/2023 10:16, Fiona Ebner wrote:
> The issue prevented FreeBSD 14 VMs with SATA disk from booting.
> 
> The commit it fixes e2a5d9b3d9c3 ("hw/ide/ahci: simplify and document
> PxCI handling") is part of stable 8.1.2.
> 
> The patch was already applied to the block branch upstream:
> https://lists.nongnu.org/archive/html/qemu-devel/2023-11/msg02711.html
> 
> Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
> ---
>  ...w-ide-ahci-fix-legacy-software-reset.patch | 107 ++++++++++++++++++
>  debian/patches/series                         |   1 +
>  2 files changed, 108 insertions(+)
>  create mode 100644 debian/patches/extra/0009-hw-ide-ahci-fix-legacy-software-reset.patch
> 
> diff --git a/debian/patches/extra/0009-hw-ide-ahci-fix-legacy-software-reset.patch b/debian/patches/extra/0009-hw-ide-ahci-fix-legacy-software-reset.patch
> new file mode 100644
> index 0000000..f070818
> --- /dev/null
> +++ b/debian/patches/extra/0009-hw-ide-ahci-fix-legacy-software-reset.patch
> @@ -0,0 +1,107 @@
> +From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
> +From: Niklas Cassel <niklas.cassel@wdc.com>
> +Date: Wed, 8 Nov 2023 23:26:57 +0100
> +Subject: [PATCH] hw/ide/ahci: fix legacy software reset
> +
> +Legacy software contains a standard mechanism for generating a reset to a
> +Serial ATA device - setting the SRST (software reset) bit in the Device
> +Control register.
> +
> +Serial ATA has a more robust mechanism called COMRESET, also referred to
> +as port reset. A port reset is the preferred mechanism for error
> +recovery and should be used in place of software reset.
> +
> +Commit e2a5d9b3d9c3 ("hw/ide/ahci: simplify and document PxCI handling")
> +improved the handling of PxCI, such that PxCI gets cleared after handling
> +a non-NCQ, or NCQ command (instead of incorrectly clearing PxCI after
> +receiving anything - even a FIS that failed to parse, which should NOT
> +clear PxCI, so that you can see which command slot that caused an error).
> +
> +However, simply clearing PxCI after a non-NCQ, or NCQ command, is not
> +enough, we also need to clear PxCI when receiving a SRST in the Device
> +Control register.
> +
> +A legacy software reset is performed by the host sending two H2D FISes,
> +the first H2D FIS asserts SRST, and the second H2D FIS deasserts SRST.
> +
> +The first H2D FIS will not get a D2H reply, and requires the FIS to have
> +the C bit set to one, such that the HBA itself will clear the bit in PxCI.
> +
> +The second H2D FIS will get a D2H reply once the diagnostic is completed.
> +The clearing of the bit in PxCI for this command should ideally be done
> +in ahci_init_d2h() (if it was a legacy software reset that caused the
> +reset (a COMRESET does not use a command slot)). However, since the reset
> +value for PxCI is 0, modify ahci_reset_port() to actually clear PxCI to 0,
> +that way we can avoid complex logic in ahci_init_d2h().
> +
> +This fixes an issue for FreeBSD where the device would fail to reset.
> +The problem was not noticed in Linux, because Linux uses a COMRESET
> +instead of a legacy software reset by default.
> +
> +Fixes: e2a5d9b3d9c3 ("hw/ide/ahci: simplify and document PxCI handling")
> +Reported-by: Marcin Juszkiewicz <marcin.juszkiewicz@linaro.org>
> +Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
> +(picked from https://lists.nongnu.org/archive/html/qemu-devel/2023-11/msg02277.html)
> +Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
> +---
> + hw/ide/ahci.c | 27 ++++++++++++++++++++++++++-
> + 1 file changed, 26 insertions(+), 1 deletion(-)
> +
> +diff --git a/hw/ide/ahci.c b/hw/ide/ahci.c
> +index d0a774bc17..1718b7e902 100644
> +--- a/hw/ide/ahci.c
> ++++ b/hw/ide/ahci.c
> +@@ -623,9 +623,13 @@ static void ahci_init_d2h(AHCIDevice *ad)
> +         return;
> +     }
> + 
> ++    /*
> ++     * For simplicity, do not call ahci_clear_cmd_issue() for this
> ++     * ahci_write_fis_d2h(). (The reset value for PxCI is 0.)
> ++     */
> +     if (ahci_write_fis_d2h(ad, true)) {
> +         ad->init_d2h_sent = true;
> +-        /* We're emulating receiving the first Reg H2D Fis from the device;
> ++        /* We're emulating receiving the first Reg D2H FIS from the device;
> +          * Update the SIG register, but otherwise proceed as normal. */
> +         pr->sig = ((uint32_t)ide_state->hcyl << 24) |
> +             (ide_state->lcyl << 16) |
> +@@ -663,6 +667,7 @@ static void ahci_reset_port(AHCIState *s, int port)
> +     pr->scr_act = 0;
> +     pr->tfdata = 0x7F;
> +     pr->sig = 0xFFFFFFFF;
> ++    pr->cmd_issue = 0;
> +     d->busy_slot = -1;
> +     d->init_d2h_sent = false;
> + 
> +@@ -1243,10 +1248,30 @@ static void handle_reg_h2d_fis(AHCIState *s, int port,
> +         case STATE_RUN:
> +             if (cmd_fis[15] & ATA_SRST) {
> +                 s->dev[port].port_state = STATE_RESET;
> ++                /*
> ++                 * When setting SRST in the first H2D FIS in the reset sequence,
> ++                 * the device does not send a D2H FIS. Host software thus has to
> ++                 * set the "Clear Busy upon R_OK" bit such that PxCI (and BUSY)
> ++                 * gets cleared. See AHCI 1.3.1, section 10.4.1 Software Reset.
> ++                 */
> ++                if (opts & AHCI_CMD_CLR_BUSY) {
> ++                    ahci_clear_cmd_issue(ad, slot);
> ++                }
> +             }
> +             break;
> +         case STATE_RESET:
> +             if (!(cmd_fis[15] & ATA_SRST)) {
> ++                /*
> ++                 * When clearing SRST in the second H2D FIS in the reset
> ++                 * sequence, the device will execute diagnostics. When this is
> ++                 * done, the device will send a D2H FIS with the good status.
> ++                 * See SATA 3.5a Gold, section 11.4 Software reset protocol.
> ++                 *
> ++                 * This D2H FIS is the first D2H FIS received from the device,
> ++                 * and is received regardless if the reset was performed by a
> ++                 * COMRESET or by setting and clearing the SRST bit. Therefore,
> ++                 * the logic for this is found in ahci_init_d2h() and not here.
> ++                 */
> +                 ahci_reset_port(s, port);
> +             }
> +             break;
> diff --git a/debian/patches/series b/debian/patches/series
> index ad84088..992299c 100644
> --- a/debian/patches/series
> +++ b/debian/patches/series
> @@ -6,6 +6,7 @@ extra/0005-hw-ide-reset-cancel-async-DMA-operation-before-reset.patch
>  extra/0006-Revert-Revert-graph-lock-Disable-locking-for-now.patch
>  extra/0007-migration-states-workaround-snapshot-performance-reg.patch
>  extra/0008-Revert-x86-acpi-workaround-Windows-not-handling-name.patch
> +extra/0009-hw-ide-ahci-fix-legacy-software-reset.patch
>  bitmap-mirror/0001-drive-mirror-add-support-for-sync-bitmap-mode-never.patch
>  bitmap-mirror/0002-drive-mirror-add-support-for-conditional-and-always-.patch
>  bitmap-mirror/0003-mirror-add-check-for-bitmap-mode-without-bitmap.patch




^ permalink raw reply	[flat|nested] 3+ messages in thread

* [pve-devel] applied: [PATCH qemu] fix #5054: backport fix for software reset with SATA
  2023-11-20  9:16 [pve-devel] [PATCH qemu] fix #5054: backport fix for software reset with SATA Fiona Ebner
  2023-11-20  9:29 ` Friedrich Weber
@ 2023-11-20  9:42 ` Thomas Lamprecht
  1 sibling, 0 replies; 3+ messages in thread
From: Thomas Lamprecht @ 2023-11-20  9:42 UTC (permalink / raw)
  To: Proxmox VE development discussion, Fiona Ebner

Am 20/11/2023 um 10:16 schrieb Fiona Ebner:
> The issue prevented FreeBSD 14 VMs with SATA disk from booting.
> 
> The commit it fixes e2a5d9b3d9c3 ("hw/ide/ahci: simplify and document
> PxCI handling") is part of stable 8.1.2.
> 
> The patch was already applied to the block branch upstream:
> https://lists.nongnu.org/archive/html/qemu-devel/2023-11/msg02711.html
> 
> Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
> ---
>  ...w-ide-ahci-fix-legacy-software-reset.patch | 107 ++++++++++++++++++
>  debian/patches/series                         |   1 +
>  2 files changed, 108 insertions(+)
>  create mode 100644 debian/patches/extra/0009-hw-ide-ahci-fix-legacy-software-reset.patch
> 
>

applied, with Friedrich's T-b, thanks!




^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2023-11-20  9:43 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-11-20  9:16 [pve-devel] [PATCH qemu] fix #5054: backport fix for software reset with SATA Fiona Ebner
2023-11-20  9:29 ` Friedrich Weber
2023-11-20  9:42 ` [pve-devel] applied: " Thomas Lamprecht

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox
Service provided by Proxmox Server Solutions GmbH | Privacy | Legal