public inbox for pve-user@lists.proxmox.com
From: Gilles Pietri <contact+dev@gilouweb.com>
To: "proxmoxve (pve-user@pve.proxmox.com)" <pve-user@pve.proxmox.com>
Subject: [PVE-User] Nested KVM on AMD EPYC processor, oops
Date: Tue, 15 Jun 2021 21:34:42 +0200
Message-ID: <38cc69ca-b0e9-6ee1-afb6-86a19c0db098@gilouweb.com>

Hi,

I'm running QEMU (through OpenStack) on a Proxmox instance running:
# pveversion -v
proxmox-ve: 6.4-1 (running kernel: 5.4.119-1-pve)
pve-manager: 6.4-8 (running version: 6.4-8/185e14db)
pve-kernel-5.4: 6.4-3
pve-kernel-helper: 6.4-3
[...]
pve-qemu-kvm: 5.2.0-6
qemu-server: 6.4-2


The QEMU version in OpenStack (Wallaby) is:
$ qemu-system-x86_64 -version
QEMU emulator version 4.2.1 (Debian 1:4.2-3ubuntu6.16)
Copyright (c) 2003-2019 Fabrice Bellard and the QEMU Project developers

The VM runs with the host CPU type (the host has an AMD EPYC 7451 24-Core);
the relevant configuration parameters are:
cores: 8
cpu: host
memory: 32768
numa: 0
rng0: source=/dev/urandom
sockets: 1


And the kernel log on the Proxmox host gets spammed with traces like this one:

[Tue Jun 15 19:13:29 2021] ------------[ cut here ]------------
[Tue Jun 15 19:13:29 2021] WARNING: CPU: 6 PID: 47530 at arch/x86/kvm/mmu.c:2250 nonpaging_update_pte+0x9/0x10 [kvm]
[Tue Jun 15 19:13:30 2021] Modules linked in: xt_nat
nf_conntrack_netlink tcp_diag inet_diag xt_MASQUERADE xfrm_user
iptable_nat nf_nat overlay binfmt_misc rpcsec_gss_krb5 auth_rpcgss nfsv4
nfs lockd grace fscache sctp veth ebt_arp ebtable_filter ebtables
ip6table_raw ip6t_REJECT nf_reject_ipv6 ip6table_filter ip6_tables
iptable_raw xt_mac ipt_REJECT nf_reject_ipv4 xt_mark xt_set xt_physdev
xt_addrtype xt_comment xt_multiport xt_conntrack nf_conntrack
nf_defrag_ipv6 nf_defrag_ipv4 xt_tcpudp ip_set_hash_net ip_set
iptable_filter bpfilter softdog nfnetlink_log nfnetlink ipmi_ssif
amd64_edac_mod edac_mce_amd kvm_amd kvm drm_vram_helper irqbypass ttm
crct10dif_pclmul crc32_pclmul drm_kms_helper ghash_clmulni_intel drm
aesni_intel i2c_algo_bit crypto_simd fb_sys_fops cryptd syscopyarea
sysfillrect glue_helper k10temp sysimgblt ccp ipmi_si ipmi_devintf
ipmi_msghandler mac_hid zfs(PO) zunicode(PO) zzstd(O) zlua(O) zavl(PO)
icp(PO) zcommon(PO) znvpair(PO) spl(O) vhost_net vhost tap ib_iser
rdma_cm iw_cm ib_cm
[Tue Jun 15 19:13:30 2021]  ib_core iscsi_tcp libiscsi_tcp libiscsi
scsi_transport_iscsi sunrpc ip_tables x_tables autofs4 dm_thin_pool
dm_persistent_data dm_bio_prison dm_bufio raid10 raid456
async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq
libcrc32c raid0 multipath linear ixgbe ahci xhci_pci xfrm_algo raid1
i2c_piix4 dca libahci xhci_hcd mdio
[Tue Jun 15 19:13:30 2021] CPU: 6 PID: 47530 Comm: kvm Tainted: P        W  O      5.4.119-1-pve #1
[Tue Jun 15 19:13:30 2021] Hardware name: empty empty/S8026GM2NRE-HOV-B, BIOS V8.711 07/09/2020
[Tue Jun 15 19:13:30 2021] RIP: 0010:nonpaging_update_pte+0x9/0x10 [kvm]
[Tue Jun 15 19:13:30 2021] Code: 00 0f 1f 44 00 00 55 31 c0 48 89 e5 5d
c3 0f 1f 00 0f 1f 44 00 00 55 48 89 e5 5d c3 0f 1f 44 00 00 0f 1f 44 00
00 55 48 89 e5 <0f> 0b 5d c3 0f 1f 00 0f 1f 44 00 00 31 f6 48 8b 04 77
48 63 54 37
[Tue Jun 15 19:13:30 2021] RSP: 0018:ffffb904e4c7ba78 EFLAGS: 00010202
[Tue Jun 15 19:13:30 2021] RAX: ffffffffc0dc0500 RBX: 0000000000000701 RCX: ffffb904e4c7bac0
[Tue Jun 15 19:13:30 2021] RDX: ffff909fa2bd0000 RSI: ffff908f2fe61e30 RDI: ffff90a63593bca0
[Tue Jun 15 19:13:30 2021] RBP: ffffb904e4c7ba78 R08: 000000000054a7ae R09: ffff909fa2bd0000
[Tue Jun 15 19:13:30 2021] R10: 0000000000000000 R11: 0000000000001970 R12: ffff90a63593bca0
[Tue Jun 15 19:13:30 2021] R13: 0000000000000000 R14: ffff909fa2bd0000 R15: ffffb904e4c7bac8
[Tue Jun 15 19:13:30 2021] FS:  00007f85da5fc700(0000) GS:ffff9098ef800000(0000) knlGS:0000000000000000
[Tue Jun 15 19:13:30 2021] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[Tue Jun 15 19:13:30 2021] CR2: 000000c420563010 CR3: 0000001fe1698000 CR4: 00000000003406e0
[Tue Jun 15 19:13:30 2021] Call Trace:
[Tue Jun 15 19:13:30 2021]  kvm_mmu_pte_write+0x421/0x430 [kvm]
[Tue Jun 15 19:13:30 2021]  kvm_page_track_write+0x82/0xc0 [kvm]
[Tue Jun 15 19:13:30 2021]  emulator_write_phys+0x3b/0x50 [kvm]
[Tue Jun 15 19:13:30 2021]  write_emulate+0xe/0x10 [kvm]
[Tue Jun 15 19:13:30 2021]  emulator_read_write_onepage+0xfc/0x320 [kvm]
[Tue Jun 15 19:13:30 2021]  emulator_read_write+0xd6/0x190 [kvm]
[Tue Jun 15 19:13:30 2021]  emulator_write_emulated+0x15/0x20 [kvm]
[Tue Jun 15 19:13:30 2021]  segmented_write+0x5d/0x80 [kvm]
[Tue Jun 15 19:13:30 2021]  writeback+0x203/0x2e0 [kvm]
[Tue Jun 15 19:13:30 2021]  x86_emulate_insn+0x990/0x1050 [kvm]
[Tue Jun 15 19:13:30 2021]  x86_emulate_instruction+0x350/0x710 [kvm]
[Tue Jun 15 19:13:30 2021]  complete_emulated_pio+0x3f/0x70 [kvm]
[Tue Jun 15 19:13:30 2021]  kvm_arch_vcpu_ioctl_run+0x4cb/0x570 [kvm]
[Tue Jun 15 19:13:30 2021]  kvm_vcpu_ioctl+0x24b/0x610 [kvm]
[Tue Jun 15 19:13:30 2021]  do_vfs_ioctl+0xa9/0x640
[Tue Jun 15 19:13:30 2021]  ? task_numa_work+0x228/0x300
[Tue Jun 15 19:13:30 2021]  ksys_ioctl+0x67/0x90
[Tue Jun 15 19:13:30 2021]  __x64_sys_ioctl+0x1a/0x20
[Tue Jun 15 19:13:30 2021]  do_syscall_64+0x57/0x190
[Tue Jun 15 19:13:30 2021]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[Tue Jun 15 19:13:30 2021] RIP: 0033:0x7f8df6dec427
[Tue Jun 15 19:13:30 2021] Code: 00 00 90 48 8b 05 69 aa 0c 00 64 c7 00
26 00 00 00 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 b8 10
00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 39 aa 0c 00 f7 d8
64 89 01 48
[Tue Jun 15 19:13:30 2021] RSP: 002b:00007f85da5f8c08 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[Tue Jun 15 19:13:30 2021] RAX: ffffffffffffffda RBX: 000000000000ae80 RCX: 00007f8df6dec427
[Tue Jun 15 19:13:30 2021] RDX: 0000000000000000 RSI: 000000000000ae80 RDI: 0000000000000022
[Tue Jun 15 19:13:30 2021] RBP: 0000000000000000 R08: 000055be0e527f58 R09: 0000000000000000
[Tue Jun 15 19:13:30 2021] R10: 0000000000000001 R11: 0000000000000246 R12: 000055be0f426eb0
[Tue Jun 15 19:13:30 2021] R13: 00007f8dea1f5000 R14: 0000000000000000 R15: 000055be0f426eb0
[Tue Jun 15 19:13:30 2021] ---[ end trace 4e3f65d27e26463c ]---

I'm guessing this is more of a QEMU/KVM bug than a Proxmox one. Nothing
actually crashes, but it has a significant impact on the performance of the
nested VMs. Before taking this upstream, I was wondering whether any Proxmox
users have noticed this with these CPUs / versions…
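
For what it's worth, this is roughly how I gauge how often it fires (just a sketch; counting the RIP line so each trace is counted once):

# dmesg -T | grep -c 'RIP: 0010:nonpaging_update_pte'   # on the Proxmox host, occurrences since boot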

Regards,

Gilles



