From: Mike O'Connor <mike@oeg.com.au>
To: Proxmox VE user list <pve-user@lists.proxmox.com>,
Iztok Gregori <iztok.gregori@elettra.eu>
Subject: Re: [PVE-User] Unresponsive VM(s) during VZdump
Date: Thu, 9 May 2024 19:00:50 +0930
Message-ID: <6598939f-2be1-45a9-8cc5-c9c473373c29@oeg.com.au>
In-Reply-To: <738ba899-5a52-4000-ba61-83dd0e360df4@elettra.eu>
Hi Iztok
You need to enable fleecing in the advanced backup settings. Slow
backup storage causes this issue: during a backup, a guest write to a
block that has not yet been backed up must wait until the old data
reaches the backup target (copy-before-write). Fleecing fixes this by
diverting that old data to a sparse image on fast local storage
instead, so guest I/O is no longer throttled by the backup target.
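
For example, a minimal sketch, assuming a PVE release with fleecing
support (8.2 or later); the VMID "100" and the storage IDs "nfs-backup"
and "local-lvm" are placeholders for your own:

  # one-off backup of VM 100, with the fleecing image on local storage
  vzdump 100 --storage nfs-backup --fleecing enabled=1,storage=local-lvm

  # or enable it for every job via /etc/vzdump.conf
  fleecing: enabled=1,storage=local-lvm

In the GUI the same option lives in the backup job's Advanced settings.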
Mike
On 9/5/2024 6:05 pm, Iztok Gregori wrote:
> Hi to all!
>
> We are in the process of upgrading our hyper-converged (Ceph-based)
> cluster from PVE 6 to PVE 8, and yesterday we finished upgrading all
> nodes to PVE 7.4.1 without issues. Tonight, during our usual VZdump
> backup (vzdump to an NFS share), our monitoring system notified us
> that 2 VMs (out of 107) were unresponsive. The VM logs contained many
> lines like these:
>
> kernel: hda: irq timeout: status=0xd0 { Busy }
> kernel: sd 2:0:0:0: [sda] abort
> kernel: sd 2:0:0:1: [sdb] abort
>
> After the backup (successfully) finished, the VMs started to function
> correctly again.
>
> On PVE 6 everything was ok.
>
> The affected machines are running old kernels ("2.6.18" and "2.6.32");
> one has the QEMU guest agent enabled, the other has not. Both use
> kvm64 as the processor type; one uses "VirtIO SCSI" as the disk
> controller, the other "LSI 53C895A". All the disks are on Ceph RBD.
>
> No related logs appeared on the host machine, and the Ceph cluster
> was working as expected. Both VMs are "biggish" (100-200 GB) and the
> backup takes 1-2 hours to complete.
>
> Do you have any idea what could be the cause of the problem? I
> suspect something in qemu-kvm, but I haven't found any useful hints
> yet.
>
> I'm still planning to upgrade everything to PVE 8; maybe the
> "problem" was fixed in later releases of qemu-kvm...
>
> I can give you more information if needed, any help is appreciated.
>
> Thanks
> Iztok
>
> P.S. This is the software stack on our cluster (16 nodes):
> # pveversion -v
> proxmox-ve: 7.4-1 (running kernel: 5.15.149-1-pve)
> pve-manager: 7.4-17 (running version: 7.4-17/513c62be)
> pve-kernel-5.15: 7.4-12
> pve-kernel-5.4: 6.4-20
> pve-kernel-5.15.149-1-pve: 5.15.149-1
> pve-kernel-5.4.203-1-pve: 5.4.203-1
> pve-kernel-5.4.157-1-pve: 5.4.157-1
> pve-kernel-5.4.106-1-pve: 5.4.106-1
> ceph: 15.2.17-pve1
> ceph-fuse: 15.2.17-pve1
> corosync: 3.1.7-pve1
> criu: 3.15-1+pve-1
> glusterfs-client: 9.2-1
> ifupdown: 0.8.36+pve2
> ksm-control-daemon: 1.4-1
> libjs-extjs: 7.0.0-1
> libknet1: 1.24-pve2
> libproxmox-acme-perl: 1.4.4
> libproxmox-backup-qemu0: 1.3.1-1
> libproxmox-rs-perl: 0.2.1
> libpve-access-control: 7.4.3
> libpve-apiclient-perl: 3.2-2
> libpve-common-perl: 7.4-2
> libpve-guest-common-perl: 4.2-4
> libpve-http-server-perl: 4.2-3
> libpve-rs-perl: 0.7.7
> libpve-storage-perl: 7.4-3
> libqb0: 1.0.5-1
> libspice-server1: 0.14.3-2.1
> lvm2: 2.03.11-2.1
> lxc-pve: 5.0.2-2
> lxcfs: 5.0.3-pve1
> novnc-pve: 1.4.0-1
> proxmox-backup-client: 2.4.6-1
> proxmox-backup-file-restore: 2.4.6-1
> proxmox-kernel-helper: 7.4-1
> proxmox-mail-forward: 0.1.1-1
> proxmox-mini-journalreader: 1.3-1
> proxmox-offline-mirror-helper: 0.5.2
> proxmox-widget-toolkit: 3.7.3
> pve-cluster: 7.3-3
> pve-container: 4.4-6
> pve-docs: 7.4-2
> pve-edk2-firmware: 3.20230228-4~bpo11+3
> pve-firewall: 4.3-5
> pve-firmware: 3.6-6
> pve-ha-manager: 3.6.1
> pve-i18n: 2.12-1
> pve-qemu-kvm: 7.2.10-1
> pve-xtermjs: 4.16.0-2
> qemu-server: 7.4-5
> smartmontools: 7.2-pve3
> spiceterm: 3.2-2
> swtpm: 0.8.0~bpo11+3
> vncterm: 1.7-1
> zfsutils-linux: 2.1.15-pve1
>
_______________________________________________
pve-user mailing list
pve-user@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-user