From: "Knight, Joshua via pve-user" <pve-user@lists.proxmox.com>
To: Fiona Ebner <f.ebner@proxmox.com>,
Proxmox VE user list <pve-user@lists.proxmox.com>
Cc: "Knight, Joshua" <Joshua.Knight@netscout.com>
Subject: Re: [PVE-User] QEMU crash with dpdk 22.11 app on Proxmox 8
Date: Wed, 4 Sep 2024 14:49:50 +0000
Message-ID: <mailman.45.1725464018.414.pve-user@lists.proxmox.com>
In-Reply-To: <8f50a6ec-5612-4522-a826-2054e4a7d06e@proxmox.com>
Thank you for the response and explanation. Would you like me to file a Bugzilla entry for this? Or is there an existing bug ID already that could be used to track the issue?
Thanks,
Josh
From: Fiona Ebner <f.ebner@proxmox.com>
Date: Wednesday, September 4, 2024 at 5:59 AM
To: Proxmox VE user list <pve-user@lists.proxmox.com>
Cc: Knight, Joshua <Joshua.Knight@netscout.com>
Subject: Re: [PVE-User] QEMU crash with dpdk 22.11 app on Proxmox 8
Hi,
On 28.08.24 at 16:56, Knight, Joshua via pve-user wrote:
>
>
> We are seeing an issue on Proxmox 8 hosts where the underlying QEMU process for a guest will crash while starting a DPDK application in the guest.
>
>
> * Proxmox 8.2.4 with QEMU 9.0.2-2
> * Guest running Ubuntu 22.04, application is dpdk 22.11 testpmd
> * Using virtio network interfaces that are up/connected
> * Binding interfaces with the (legacy) igb_uio driver
>
> When starting the application, the SSH connection to the VM drops and the VM is shown as powered off in the UI.
>
> root@karma06:~/dpdk-22.11# python3 /root/dpdk-22.11/res/usr/local/bin/dpdk-devbind.py --bind=igb_uio enp6s20
> root@karma06:~/dpdk-22.11# python3 /root/dpdk-22.11/res/usr/local/bin/dpdk-devbind.py --bind=igb_uio enp6s21
> root@karma06:~/dpdk-22.11# python3 /root/dpdk-22.11/res/usr/local/bin/dpdk-devbind.py --bind=igb_uio enp6s22
> root@karma06:~/dpdk-22.11# python3 /root/dpdk-22.11/res/usr/local/bin/dpdk-devbind.py --bind=igb_uio enp6s23
>
> root@karma06:~/dpdk-22.11# /root/dpdk-22.11/res/usr/local/bin/dpdk-testpmd -- -i --port-topology=chained --rxq=1 --txq=1 --rss-ip
> EAL: Detected CPU lcores: 6
> EAL: Detected NUMA nodes: 1
> EAL: Detected static linkage of DPDK
> EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
> EAL: Selected IOVA mode 'PA'
> EAL: VFIO support initialized
> EAL: Probe PCI driver: net_virtio (1af4:1000) device: 0000:06:12.0 (socket -1)
> eth_virtio_pci_init(): Failed to init PCI device
> EAL: Requested device 0000:06:12.0 cannot be used
> EAL: Probe PCI driver: net_virtio (1af4:1000) device: 0000:06:13.0 (socket -1)
> eth_virtio_pci_init(): Failed to init PCI device
> EAL: Requested device 0000:06:13.0 cannot be used
> EAL: Probe PCI driver: net_virtio (1af4:1000) device: 0000:06:14.0 (socket -1)
> EAL: Probe PCI driver: net_virtio (1af4:1000) device: 0000:06:15.0 (socket -1)
> EAL: Probe PCI driver: net_virtio (1af4:1000) device: 0000:06:16.0 (socket -1)
> EAL: Probe PCI driver: net_virtio (1af4:1000) device: 0000:06:17.0 (socket -1)
> TELEMETRY: No legacy callbacks, legacy socket not created
> Interactive-mode selected
> Warning: NUMA should be configured manually by using --port-numa-config and --ring-numa-config parameters along with --numa.
> testpmd: create a new mbuf pool <mb_pool_0>: n=187456, size=2176, socket=0
> testpmd: preferred mempool ops selected: ring_mp_mc
> Configuring Port 0 (socket 0)
>
> client_loop: send disconnect: Broken pipe
>
>
>
> A QEMU assertion failure is logged in the host’s system log. Attaching GDB shows that the QEMU process aborts with SIGABRT.
>
> karma QEMU[27334]: kvm: ../accel/kvm/kvm-all.c:1836: kvm_irqchip_commit_routes: Assertion `ret == 0' failed.
>
> Thread 10 "CPU 0/KVM" received signal SIGABRT, Aborted.
> [Switching to Thread 0x7d999cc006c0 (LWP 36256)]
> __pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=6, no_tid=no_tid@entry=0) at ./nptl/pthread_kill.c:44
> 44 ./nptl/pthread_kill.c: No such file or directory.
> (gdb) bt
> #0 __pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=6, no_tid=no_tid@entry=0) at ./nptl/pthread_kill.c:44
> #1 0x00007d99a10a9e8f in __pthread_kill_internal (signo=6, threadid=<optimized out>) at ./nptl/pthread_kill.c:78
> #2 0x00007d99a105afb2 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
> #3 0x00007d99a1045472 in __GI_abort () at ./stdlib/abort.c:79
> #4 0x00007d99a1045395 in __assert_fail_base (fmt=0x7d99a11b9a90 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n",
> assertion=assertion@entry=0x5a9eb5a20f5e "ret == 0", file=file@entry=0x5a9eb5a021a5 "../accel/kvm/kvm-all.c", line=line@entry=1836,
> function=function@entry=0x5a9eb5a03ca0 <__PRETTY_FUNCTION__.23> "kvm_irqchip_commit_routes") at ./assert/assert.c:92
> #5 0x00007d99a1053eb2 in __GI___assert_fail (assertion=assertion@entry=0x5a9eb5a20f5e "ret == 0",
> file=file@entry=0x5a9eb5a021a5 "../accel/kvm/kvm-all.c", line=line@entry=1836,
> function=function@entry=0x5a9eb5a03ca0 <__PRETTY_FUNCTION__.23> "kvm_irqchip_commit_routes") at ./assert/assert.c:101
> #6 0x00005a9eb566248c in kvm_irqchip_commit_routes (s=0x5a9eb79eed10) at ../accel/kvm/kvm-all.c:1836
> #7 kvm_irqchip_commit_routes (s=0x5a9eb79eed10) at ../accel/kvm/kvm-all.c:1821
> #8 0x00005a9eb540bed2 in virtio_pci_one_vector_unmask (proxy=proxy@entry=0x5a9eb9f5ada0, queue_no=queue_no@entry=4294967295,
> vector=vector@entry=0, msg=..., n=0x5a9eb9f63368) at ../hw/virtio/virtio-pci.c:991
> #9 0x00005a9eb540c09c in virtio_pci_vector_unmask (dev=0x5a9eb9f5ada0, vector=0, msg=...) at ../hw/virtio/virtio-pci.c:1056
> #10 0x00005a9eb536ff62 in msix_fire_vector_notifier (is_masked=false, vector=0, dev=0x5a9eb9f5ada0) at ../hw/pci/msix.c:120
> #11 msix_handle_mask_update (dev=0x5a9eb9f5ada0, vector=0, was_masked=<optimized out>) at ../hw/pci/msix.c:140
> #12 0x00005a9eb5602260 in memory_region_write_accessor (mr=0x5a9eb9f5b3e0, addr=12, value=<optimized out>, size=4, shift=<optimized out>,
> mask=<optimized out>, attrs=...) at ../system/memory.c:497
> #13 0x00005a9eb5602f4e in access_with_adjusted_size (addr=addr@entry=12, value=value@entry=0x7d999cbfae58, size=size@entry=4,
> access_size_min=<optimized out>, access_size_max=<optimized out>, access_fn=0x5a9eb56021e0 <memory_region_write_accessor>,
> mr=<optimized out>, attrs=...) at ../system/memory.c:573
> #14 0x00005a9eb560403c in memory_region_dispatch_write (mr=mr@entry=0x5a9eb9f5b3e0, addr=addr@entry=12, data=<optimized out>,
> op=<optimized out>, attrs=attrs@entry=...) at ../system/memory.c:1528
> #15 0x00005a9eb560b95f in flatview_write_continue_step (attrs=attrs@entry=..., buf=buf@entry=0x7d99a3433028 "", mr_addr=12,
> l=l@entry=0x7d999cbfaf80, mr=0x5a9eb9f5b3e0, len=4) at ../system/physmem.c:2713
> #16 0x00005a9eb560bbed in flatview_write_continue (mr=<optimized out>, l=<optimized out>, mr_addr=<optimized out>, len=4, ptr=0xfdf8500c,
> attrs=..., addr=4260909068, fv=0x7d8d6c0796b0) at ../system/physmem.c:2743
> #17 flatview_write (fv=0x7d8d6c0796b0, addr=addr@entry=4260909068, attrs=attrs@entry=..., buf=buf@entry=0x7d99a3433028, len=len@entry=4)
> at ../system/physmem.c:2774
> #18 0x00005a9eb560f251 in address_space_write (len=4, buf=0x7d99a3433028, attrs=..., addr=4260909068, as=0x5a9eb66f1f20 <address_space_memory>)
> at ../system/physmem.c:2894
> #19 address_space_rw (as=0x5a9eb66f1f20 <address_space_memory>, addr=4260909068, attrs=attrs@entry=..., buf=buf@entry=0x7d99a3433028, len=4,
> is_write=<optimized out>) at ../system/physmem.c:2904
> #20 0x00005a9eb56660e8 in kvm_cpu_exec (cpu=cpu@entry=0x5a9eb81e6890) at ../accel/kvm/kvm-all.c:2917
> #21 0x00005a9eb56676d5 in kvm_vcpu_thread_fn (arg=arg@entry=0x5a9eb81e6890) at ../accel/kvm/kvm-accel-ops.c:50
> #22 0x00005a9eb581dfe8 in qemu_thread_start (args=0x5a9eb81ee390) at ../util/qemu-thread-posix.c:541
> #23 0x00007d99a10a8134 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
> #24 0x00007d99a11287dc in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
>
>
> One thing that’s interesting about this backtrace is it seems to exactly match an existing issue in QEMU that claims to be patched, and that patch should be present in QEMU 9.0.2, the version running on this Proxmox host.
>
> https://gitlab.com/qemu-project/qemu/-/issues/1928
>
> We’ve found a workaround by switching from the deprecated igb_uio driver to the vfio-pci driver when binding the interfaces for dpdk. In this case the VM does not crash. But I’m wondering if anyone has hit this before or if it’s a known issue. I would certainly not expect any operation in the guest to cause QEMU to crash. It’s also odd that the crash seen claims to be patched in 9.0.2.
>
> We’ve been able to reproduce this on Proxmox 8.0, 8.1, 8.2 on both AMD and Intel processors. The crash does not occur on earlier releases such as Proxmox 6.4, and does not occur with earlier dpdk versions such as 20.08.
>
> Thanks,
> Josh
>
We currently carry a revert of that patch, because it caused some
regressions that sounded just as bad as the original issue [0].
A fix for the regressions has landed upstream now [1], and I'll take a
look at pulling it in and dropping the revert.
[0]:
https://git.proxmox.com/?p=pve-qemu.git;a=blob;f=debian/patches/extra/0006-Revert-virtio-pci-fix-use-of-a-released-vector.patch;h=d2de6d11ba1e2a2bd2ea8dccf660ac6e66b047d4;hb=582fd47901356342b8e0bef19d7d8fdc324d2d96
[1]:
https://lore.kernel.org/qemu-devel/a8e63ff289d137197ad7a701a587cc432872d798.1724151593.git.mst@redhat.com/
Best Regards,
Fiona
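
Until a build without the revert is available, the workaround from the original report (binding the virtio interfaces to vfio-pci instead of igb_uio) might look roughly like the sketch below. It only prints the commands and does not run them. The dpdk-devbind.py path and the interface names are copied from the report; the function name vfio_bind_cmds is hypothetical. One assumption worth checking: in a guest without a virtual IOMMU, vfio-pci generally has to run in no-IOMMU mode (the enable_unsafe_noiommu_mode parameter of the vfio module). The thread does not say whether that was needed here.

```shell
#!/bin/sh
# Dry-run sketch: print, but do not execute, the commands that would bind
# the guest interfaces to vfio-pci instead of igb_uio.
# Path and interface names come from the report; adjust them for your guest.
DEVBIND=/root/dpdk-22.11/res/usr/local/bin/dpdk-devbind.py

# Assumption (not stated in the thread): without a virtual IOMMU the vfio
# module may first need to be loaded in no-IOMMU mode, for example:
#   modprobe vfio enable_unsafe_noiommu_mode=1

# vfio_bind_cmds is a hypothetical helper: it emits one bind command per
# interface name passed as an argument.
vfio_bind_cmds() {
    for ifname in "$@"; do
        printf 'python3 %s --bind=vfio-pci %s\n' "$DEVBIND" "$ifname"
    done
}

vfio_bind_cmds enp6s20 enp6s21 enp6s22 enp6s23
```

After reviewing the printed commands, they can be run by hand in the guest, in the same way as the igb_uio bind commands in the report.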