From: Gilles Pietri <contact+dev@gilouweb.com>
To: "proxmoxve (pve-user@pve.proxmox.com)" <pve-user@pve.proxmox.com>
Subject: [PVE-User] Nested KVM on AMD EPYC processor, oops
Date: Tue, 15 Jun 2021 21:34:42 +0200
Message-ID: <38cc69ca-b0e9-6ee1-afb6-86a19c0db098@gilouweb.com>

Hi,

I'm running QEMU (through OpenStack) inside a VM on a Proxmox host running:
# pveversion -v
proxmox-ve: 6.4-1 (running kernel: 5.4.119-1-pve)
pve-manager: 6.4-8 (running version: 6.4-8/185e14db)
pve-kernel-5.4: 6.4-3
pve-kernel-helper: 6.4-3
[...]
pve-qemu-kvm: 5.2.0-6
qemu-server: 6.4-2
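
For reference, nested virtualization is enabled on the host; kvm_amd
exposes it as a module parameter:

# cat /sys/module/kvm_amd/parameters/nested
1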


The QEMU version inside the OpenStack (Wallaby) guest is:
$ qemu-system-x86_64 -version
QEMU emulator version 4.2.1 (Debian 1:4.2-3ubuntu6.16)
Copyright (c) 2003-2019 Fabrice Bellard and the QEMU Project developers
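
The vCPU does expose SVM to that guest, so nested KVM itself is
expected to work (the count simply matches the number of vCPUs):

$ grep -c svm /proc/cpuinfo
8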

The VM uses the host CPU type; the physical CPU is an AMD EPYC 7451
24-Core. The relevant parts of its configuration:
cores: 8
cpu: host
memory: 32768
numa: 0
rng0: source=/dev/urandom
sockets: 1
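
That's an excerpt of the VM's configuration; setting cpu: host is
what passes SVM through to the guest in the first place. <vmid> is a
placeholder here:

# qm config <vmid>
# qm set <vmid> --cpu host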


And the host kernel log gets spammed with traces like this one:

[Tue Jun 15 19:13:29 2021] ------------[ cut here ]------------
[Tue Jun 15 19:13:29 2021] WARNING: CPU: 6 PID: 47530 at
arch/x86/kvm/mmu.c:2250 nonpaging_update_pte+0x9/0x10 [kvm]
[Tue Jun 15 19:13:30 2021] Modules linked in: xt_nat
nf_conntrack_netlink tcp_diag inet_diag xt_MASQUERADE xfrm_user
iptable_nat nf_nat overlay binfmt_misc rpcsec_gss_krb5 auth_rpcgss nfsv4
nfs lockd grace fscache sctp veth ebt_arp ebtable_filter ebtables
ip6table_raw ip6t_REJECT nf_reject_ipv6 ip6table_filter ip6_tables
iptable_raw xt_mac ipt_REJECT nf_reject_ipv4 xt_mark xt_set xt_physdev
xt_addrtype xt_comment xt_multiport xt_conntrack nf_conntrack
nf_defrag_ipv6 nf_defrag_ipv4 xt_tcpudp ip_set_hash_net ip_set
iptable_filter bpfilter softdog nfnetlink_log nfnetlink ipmi_ssif
amd64_edac_mod edac_mce_amd kvm_amd kvm drm_vram_helper irqbypass ttm
crct10dif_pclmul crc32_pclmul drm_kms_helper ghash_clmulni_intel drm
aesni_intel i2c_algo_bit crypto_simd fb_sys_fops cryptd syscopyarea
sysfillrect glue_helper k10temp sysimgblt ccp ipmi_si ipmi_devintf
ipmi_msghandler mac_hid zfs(PO) zunicode(PO) zzstd(O) zlua(O) zavl(PO)
icp(PO) zcommon(PO) znvpair(PO) spl(O) vhost_net vhost tap ib_iser
rdma_cm iw_cm ib_cm
[Tue Jun 15 19:13:30 2021]  ib_core iscsi_tcp libiscsi_tcp libiscsi
scsi_transport_iscsi sunrpc ip_tables x_tables autofs4 dm_thin_pool
dm_persistent_data dm_bio_prison dm_bufio raid10 raid456
async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq
libcrc32c raid0 multipath linear ixgbe ahci xhci_pci xfrm_algo raid1
i2c_piix4 dca libahci xhci_hcd mdio
[Tue Jun 15 19:13:30 2021] CPU: 6 PID: 47530 Comm: kvm Tainted: P
W  O      5.4.119-1-pve #1
[Tue Jun 15 19:13:30 2021] Hardware name: empty empty/S8026GM2NRE-HOV-B,
BIOS V8.711 07/09/2020
[Tue Jun 15 19:13:30 2021] RIP: 0010:nonpaging_update_pte+0x9/0x10 [kvm]
[Tue Jun 15 19:13:30 2021] Code: 00 0f 1f 44 00 00 55 31 c0 48 89 e5 5d
c3 0f 1f 00 0f 1f 44 00 00 55 48 89 e5 5d c3 0f 1f 44 00 00 0f 1f 44 00
00 55 48 89 e5 <0f> 0b 5d c3 0f 1f 00 0f 1f 44 00 00 31 f6 48 8b 04 77
48 63 54 37
[Tue Jun 15 19:13:30 2021] RSP: 0018:ffffb904e4c7ba78 EFLAGS: 00010202
[Tue Jun 15 19:13:30 2021] RAX: ffffffffc0dc0500 RBX: 0000000000000701
RCX: ffffb904e4c7bac0
[Tue Jun 15 19:13:30 2021] RDX: ffff909fa2bd0000 RSI: ffff908f2fe61e30
RDI: ffff90a63593bca0
[Tue Jun 15 19:13:30 2021] RBP: ffffb904e4c7ba78 R08: 000000000054a7ae
R09: ffff909fa2bd0000
[Tue Jun 15 19:13:30 2021] R10: 0000000000000000 R11: 0000000000001970
R12: ffff90a63593bca0
[Tue Jun 15 19:13:30 2021] R13: 0000000000000000 R14: ffff909fa2bd0000
R15: ffffb904e4c7bac8
[Tue Jun 15 19:13:30 2021] FS:  00007f85da5fc700(0000)
GS:ffff9098ef800000(0000) knlGS:0000000000000000
[Tue Jun 15 19:13:30 2021] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[Tue Jun 15 19:13:30 2021] CR2: 000000c420563010 CR3: 0000001fe1698000
CR4: 00000000003406e0
[Tue Jun 15 19:13:30 2021] Call Trace:
[Tue Jun 15 19:13:30 2021]  kvm_mmu_pte_write+0x421/0x430 [kvm]
[Tue Jun 15 19:13:30 2021]  kvm_page_track_write+0x82/0xc0 [kvm]
[Tue Jun 15 19:13:30 2021]  emulator_write_phys+0x3b/0x50 [kvm]
[Tue Jun 15 19:13:30 2021]  write_emulate+0xe/0x10 [kvm]
[Tue Jun 15 19:13:30 2021]  emulator_read_write_onepage+0xfc/0x320 [kvm]
[Tue Jun 15 19:13:30 2021]  emulator_read_write+0xd6/0x190 [kvm]
[Tue Jun 15 19:13:30 2021]  emulator_write_emulated+0x15/0x20 [kvm]
[Tue Jun 15 19:13:30 2021]  segmented_write+0x5d/0x80 [kvm]
[Tue Jun 15 19:13:30 2021]  writeback+0x203/0x2e0 [kvm]
[Tue Jun 15 19:13:30 2021]  x86_emulate_insn+0x990/0x1050 [kvm]
[Tue Jun 15 19:13:30 2021]  x86_emulate_instruction+0x350/0x710 [kvm]
[Tue Jun 15 19:13:30 2021]  complete_emulated_pio+0x3f/0x70 [kvm]
[Tue Jun 15 19:13:30 2021]  kvm_arch_vcpu_ioctl_run+0x4cb/0x570 [kvm]
[Tue Jun 15 19:13:30 2021]  kvm_vcpu_ioctl+0x24b/0x610 [kvm]
[Tue Jun 15 19:13:30 2021]  do_vfs_ioctl+0xa9/0x640
[Tue Jun 15 19:13:30 2021]  ? task_numa_work+0x228/0x300
[Tue Jun 15 19:13:30 2021]  ksys_ioctl+0x67/0x90
[Tue Jun 15 19:13:30 2021]  __x64_sys_ioctl+0x1a/0x20
[Tue Jun 15 19:13:30 2021]  do_syscall_64+0x57/0x190
[Tue Jun 15 19:13:30 2021]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[Tue Jun 15 19:13:30 2021] RIP: 0033:0x7f8df6dec427
[Tue Jun 15 19:13:30 2021] Code: 00 00 90 48 8b 05 69 aa 0c 00 64 c7 00
26 00 00 00 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 b8 10
00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 39 aa 0c 00 f7 d8
64 89 01 48
[Tue Jun 15 19:13:30 2021] RSP: 002b:00007f85da5f8c08 EFLAGS: 00000246
ORIG_RAX: 0000000000000010
[Tue Jun 15 19:13:30 2021] RAX: ffffffffffffffda RBX: 000000000000ae80
RCX: 00007f8df6dec427
[Tue Jun 15 19:13:30 2021] RDX: 0000000000000000 RSI: 000000000000ae80
RDI: 0000000000000022
[Tue Jun 15 19:13:30 2021] RBP: 0000000000000000 R08: 000055be0e527f58
R09: 0000000000000000
[Tue Jun 15 19:13:30 2021] R10: 0000000000000001 R11: 0000000000000246
R12: 000055be0f426eb0
[Tue Jun 15 19:13:30 2021] R13: 00007f8dea1f5000 R14: 0000000000000000
R15: 000055be0f426eb0
[Tue Jun 15 19:13:30 2021] ---[ end trace 4e3f65d27e26463c ]---
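
To get an idea of how noisy this is, I just count occurrences on the
host (it keeps growing while the nested VMs are active):

# dmesg | grep -c nonpaging_update_pte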

I'm guessing this is more of a QEMU/KVM bug than a Proxmox one, but it
has a significant impact on the performance of the nested VMs, even
though nothing actually crashes. Before taking this upstream, I was
wondering whether any Proxmox users had noticed this with these
CPUs/versions…
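
Two things I'm considering, both untested on my side, in case anyone
can confirm either one helps: the opt-in 5.11 kernel available for
PVE 6.4, or switching the guest from cpu: host to the named EPYC
model (<vmid> is a placeholder again):

# apt install pve-kernel-5.11
# qm set <vmid> --cpu EPYC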

Regards,

Gilles



Thread overview: 2+ messages
2021-06-15 19:34 Gilles Pietri [this message]
2021-06-24 14:58 ` Gilles Pietri
