From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from firstgate.proxmox.com (firstgate.proxmox.com [212.224.123.68]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by lists.proxmox.com (Postfix) with ESMTPS id AE44F9B219 for ; Mon, 20 Nov 2023 10:29:19 +0100 (CET) Received: from firstgate.proxmox.com (localhost [127.0.0.1]) by firstgate.proxmox.com (Proxmox) with ESMTP id 971A416316 for ; Mon, 20 Nov 2023 10:29:19 +0100 (CET) Received: from proxmox-new.maurer-it.com (proxmox-new.maurer-it.com [94.136.29.106]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits)) (No client certificate requested) by firstgate.proxmox.com (Proxmox) with ESMTPS for ; Mon, 20 Nov 2023 10:29:18 +0100 (CET) Received: from proxmox-new.maurer-it.com (localhost.localdomain [127.0.0.1]) by proxmox-new.maurer-it.com (Proxmox) with ESMTP id 8E524433AD for ; Mon, 20 Nov 2023 10:29:18 +0100 (CET) Message-ID: Date: Mon, 20 Nov 2023 10:29:17 +0100 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird To: Proxmox VE development discussion , Fiona Ebner References: <20231120091617.109826-1-f.ebner@proxmox.com> Content-Language: en-US From: Friedrich Weber In-Reply-To: <20231120091617.109826-1-f.ebner@proxmox.com> Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit X-SPAM-LEVEL: Spam detection results: 0 AWL -0.122 Adjusted score from AWL reputation of From: address BAYES_00 -1.9 Bayes spam probability is 0 to 1% DMARC_MISSING 0.1 Missing DMARC policy KAM_DMARC_STATUS 0.01 Test Rule for DKIM or SPF Failure with Strict Alignment SPF_HELO_NONE 0.001 SPF: HELO does not publish an SPF Record SPF_PASS -0.001 SPF: sender matches SPF record T_SCC_BODY_TEXT_LINE -0.01 - Subject: Re: [pve-devel] [PATCH qemu] fix #5054: backport fix for software reset with SATA X-BeenThere: pve-devel@lists.proxmox.com X-Mailman-Version: 2.1.29 Precedence: list List-Id: Proxmox VE development discussion List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 20 Nov 2023 09:29:19 -0000 Tested with an OPNsense VM: With pve-qemu-kvm 8.1.2-2, it did not boot from SATA ("Root mount waiting for: CAM"). virtio worked though. With the patched pve-qemu-kvm package I got from Fiona, the VM booted from SATA again. virtio still works too. Tested-by: Friedrich Weber On 20/11/2023 10:16, Fiona Ebner wrote: > The issue prevented FreeBSD 14 VMs with SATA disk from booting. > > The commit it fixes e2a5d9b3d9c3 ("hw/ide/ahci: simplify and document > PxCI handling") is part of stable 8.1.2. > > The patch was already applied to the block branch upstream: > https://lists.nongnu.org/archive/html/qemu-devel/2023-11/msg02711.html > > Signed-off-by: Fiona Ebner > --- > ...w-ide-ahci-fix-legacy-software-reset.patch | 107 ++++++++++++++++++ > debian/patches/series | 1 + > 2 files changed, 108 insertions(+) > create mode 100644 debian/patches/extra/0009-hw-ide-ahci-fix-legacy-software-reset.patch > > diff --git a/debian/patches/extra/0009-hw-ide-ahci-fix-legacy-software-reset.patch b/debian/patches/extra/0009-hw-ide-ahci-fix-legacy-software-reset.patch > new file mode 100644 > index 0000000..f070818 > --- /dev/null > +++ b/debian/patches/extra/0009-hw-ide-ahci-fix-legacy-software-reset.patch > @@ -0,0 +1,107 @@ > +From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001 > +From: Niklas Cassel > +Date: Wed, 8 Nov 2023 23:26:57 +0100 > +Subject: [PATCH] hw/ide/ahci: fix legacy software reset > + > +Legacy software contains a standard mechanism for generating a reset to a > +Serial ATA device - setting the SRST (software reset) bit in the Device > +Control register. > + > +Serial ATA has a more robust mechanism called COMRESET, also referred to > +as port reset. A port reset is the preferred mechanism for error > +recovery and should be used in place of software reset. > + > +Commit e2a5d9b3d9c3 ("hw/ide/ahci: simplify and document PxCI handling") > +improved the handling of PxCI, such that PxCI gets cleared after handling > +a non-NCQ, or NCQ command (instead of incorrectly clearing PxCI after > +receiving anything - even a FIS that failed to parse, which should NOT > +clear PxCI, so that you can see which command slot that caused an error). > + > +However, simply clearing PxCI after a non-NCQ, or NCQ command, is not > +enough, we also need to clear PxCI when receiving a SRST in the Device > +Control register. > + > +A legacy software reset is performed by the host sending two H2D FISes, > +the first H2D FIS asserts SRST, and the second H2D FIS deasserts SRST. > + > +The first H2D FIS will not get a D2H reply, and requires the FIS to have > +the C bit set to one, such that the HBA itself will clear the bit in PxCI. > + > +The second H2D FIS will get a D2H reply once the diagnostic is completed. > +The clearing of the bit in PxCI for this command should ideally be done > +in ahci_init_d2h() (if it was a legacy software reset that caused the > +reset (a COMRESET does not use a command slot)). However, since the reset > +value for PxCI is 0, modify ahci_reset_port() to actually clear PxCI to 0, > +that way we can avoid complex logic in ahci_init_d2h(). > + > +This fixes an issue for FreeBSD where the device would fail to reset. > +The problem was not noticed in Linux, because Linux uses a COMRESET > +instead of a legacy software reset by default. > + > +Fixes: e2a5d9b3d9c3 ("hw/ide/ahci: simplify and document PxCI handling") > +Reported-by: Marcin Juszkiewicz > +Signed-off-by: Niklas Cassel > +(picked from https://lists.nongnu.org/archive/html/qemu-devel/2023-11/msg02277.html) > +Signed-off-by: Fiona Ebner > +--- > + hw/ide/ahci.c | 27 ++++++++++++++++++++++++++- > + 1 file changed, 26 insertions(+), 1 deletion(-) > + > +diff --git a/hw/ide/ahci.c b/hw/ide/ahci.c > +index d0a774bc17..1718b7e902 100644 > +--- a/hw/ide/ahci.c > ++++ b/hw/ide/ahci.c > +@@ -623,9 +623,13 @@ static void ahci_init_d2h(AHCIDevice *ad) > + return; > + } > + > ++ /* > ++ * For simplicity, do not call ahci_clear_cmd_issue() for this > ++ * ahci_write_fis_d2h(). (The reset value for PxCI is 0.) > ++ */ > + if (ahci_write_fis_d2h(ad, true)) { > + ad->init_d2h_sent = true; > +- /* We're emulating receiving the first Reg H2D Fis from the device; > ++ /* We're emulating receiving the first Reg D2H FIS from the device; > + * Update the SIG register, but otherwise proceed as normal. */ > + pr->sig = ((uint32_t)ide_state->hcyl << 24) | > + (ide_state->lcyl << 16) | > +@@ -663,6 +667,7 @@ static void ahci_reset_port(AHCIState *s, int port) > + pr->scr_act = 0; > + pr->tfdata = 0x7F; > + pr->sig = 0xFFFFFFFF; > ++ pr->cmd_issue = 0; > + d->busy_slot = -1; > + d->init_d2h_sent = false; > + > +@@ -1243,10 +1248,30 @@ static void handle_reg_h2d_fis(AHCIState *s, int port, > + case STATE_RUN: > + if (cmd_fis[15] & ATA_SRST) { > + s->dev[port].port_state = STATE_RESET; > ++ /* > ++ * When setting SRST in the first H2D FIS in the reset sequence, > ++ * the device does not send a D2H FIS. Host software thus has to > ++ * set the "Clear Busy upon R_OK" bit such that PxCI (and BUSY) > ++ * gets cleared. See AHCI 1.3.1, section 10.4.1 Software Reset. > ++ */ > ++ if (opts & AHCI_CMD_CLR_BUSY) { > ++ ahci_clear_cmd_issue(ad, slot); > ++ } > + } > + break; > + case STATE_RESET: > + if (!(cmd_fis[15] & ATA_SRST)) { > ++ /* > ++ * When clearing SRST in the second H2D FIS in the reset > ++ * sequence, the device will execute diagnostics. When this is > ++ * done, the device will send a D2H FIS with the good status. > ++ * See SATA 3.5a Gold, section 11.4 Software reset protocol. > ++ * > ++ * This D2H FIS is the first D2H FIS received from the device, > ++ * and is received regardless if the reset was performed by a > ++ * COMRESET or by setting and clearing the SRST bit. Therefore, > ++ * the logic for this is found in ahci_init_d2h() and not here. > ++ */ > + ahci_reset_port(s, port); > + } > + break; > diff --git a/debian/patches/series b/debian/patches/series > index ad84088..992299c 100644 > --- a/debian/patches/series > +++ b/debian/patches/series > @@ -6,6 +6,7 @@ extra/0005-hw-ide-reset-cancel-async-DMA-operation-before-reset.patch > extra/0006-Revert-Revert-graph-lock-Disable-locking-for-now.patch > extra/0007-migration-states-workaround-snapshot-performance-reg.patch > extra/0008-Revert-x86-acpi-workaround-Windows-not-handling-name.patch > +extra/0009-hw-ide-ahci-fix-legacy-software-reset.patch > bitmap-mirror/0001-drive-mirror-add-support-for-sync-bitmap-mode-never.patch > bitmap-mirror/0002-drive-mirror-add-support-for-conditional-and-always-.patch > bitmap-mirror/0003-mirror-add-check-for-bitmap-mode-without-bitmap.patch