From: Roland Kammerer <roland.kammerer@linbit.com>
To: pve-devel@lists.proxmox.com
Subject: [pve-devel] migrate local -> drbd fails with vanished job
Date: Fri, 26 Nov 2021 14:03:57 +0100
Message-ID: <20211126130357.GS1745@rck.sh>
Dear PVE devs,

While most of our users start with fresh VMs on DRBD storage, from time to
time somebody tries to migrate an existing local VM to DRBD storage. This
currently fails, while migrating VMs from DRBD to DRBD works fine.
I added some debug code to PVE/QemuServer.pm at the location where things
appear to go wrong, or at least where I first saw them going wrong:
root@pve:/usr/share/perl5/PVE# diff -Nur QemuServer.pm{.orig,}
--- QemuServer.pm.orig	2021-11-26 11:27:28.879989894 +0100
+++ QemuServer.pm	2021-11-26 11:26:30.490988789 +0100
@@ -7390,6 +7390,8 @@
     $completion //= 'complete';
     $op //= "mirror";
 
+    print "$vmid, $vmiddst, $jobs, $completion, $qga, $op \n";
+    { use Data::Dumper; print Dumper($jobs); };
     eval {
         my $err_complete = 0;
 
@@ -7419,6 +7421,7 @@
                     next;
                 }
 
+                print "vanished: $vanished\n"; # same as !defined($job)
                 die "$job_id: '$op' has been cancelled\n" if !defined($job);
 
                 my $busy = $job->{busy};
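
For context, my reading of the surrounding loop in qemu_drive_mirror_monitor()
is roughly the following (a simplified paraphrase, not the verbatim code;
mon_cmd() is the QMP helper from PVE::QemuServer::Monitor):

    # simplified sketch of one iteration of the monitor loop
    my $stats = mon_cmd($vmid, "query-block-jobs");
    my %running = map { $_->{device} => $_ } @$stats;

    for my $job_id (sort keys %$jobs) {
        my $job = $running{$job_id};
        my $vanished = !defined($job);   # QEMU no longer reports this job
        my $complete = defined($jobs->{$job_id}->{complete}) && $vanished;
        if ($complete || ($vanished && $completion eq 'auto')) {
            print "$job_id: $op-job finished\n";
            delete $jobs->{$job_id};
            next;
        }
        # with $completion eq 'skip' a vanished job counts as cancelled
        die "$job_id: '$op' has been cancelled\n" if $vanished;
    }

If I read that right, the "vanished: 1" below means QEMU stopped reporting
the drive-scsi0 block job immediately after drive-mirror was started.
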
With that in place, I tried to live-migrate the running VM from node "pve"
to "pvf":
2021-11-26 11:29:10 starting migration of VM 100 to node 'pvf' (xx.xx.xx.xx)
2021-11-26 11:29:10 found local disk 'local-lvm:vm-100-disk-0' (in current VM config)
2021-11-26 11:29:10 starting VM 100 on remote node 'pvf'
2021-11-26 11:29:18 volume 'local-lvm:vm-100-disk-0' is 'drbdstorage:vm-100-disk-1' on the target
2021-11-26 11:29:18 start remote tunnel
2021-11-26 11:29:19 ssh tunnel ver 1
2021-11-26 11:29:19 starting storage migration
2021-11-26 11:29:19 scsi0: start migration to nbd:unix:/run/qemu-server/100_nbd.migrate:exportname=drive-scsi0
drive mirror is starting for drive-scsi0
Use of uninitialized value $qga in concatenation (.) or string at /usr/share/perl5/PVE/QemuServer.pm line 7393.
100, 100, HASH(0x557b44474a80), skip, , mirror
$VAR1 = {
'drive-scsi0' => {}
};
vanished: 1
drive-scsi0: Cancelling block job
drive-scsi0: Done.
2021-11-26 11:29:19 ERROR: online migrate failure - block job (mirror) error: drive-scsi0: 'mirror' has been cancelled
2021-11-26 11:29:19 aborting phase 2 - cleanup resources
2021-11-26 11:29:19 migrate_cancel
2021-11-26 11:29:22 ERROR: migration finished with problems (duration 00:00:12)
TASK ERROR: migration problems
What I also see on "pvf" is that the plugin actually creates the DRBD block
device, and "something" even tries to write data to it, as the DRBD device
auto-promotes to Primary.

Any hints on how I can debug this further? The block device should be ready
at that point, so what is going on in the background here?
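
For reference, the block jobs can also be polled by hand while the migration
runs, e.g. with a one-liner like this on the source node (hypothetical
snippet, assuming mon_cmd() can be imported from PVE::QemuServer::Monitor):

    # dump what QEMU reports for VM 100, once per second for 10 seconds
    perl -MPVE::QemuServer::Monitor=mon_cmd -MData::Dumper \
        -e 'for (1..10) { print Dumper(mon_cmd(100, "query-block-jobs")); sleep 1 }'
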
FWIW the plugin can be found here:
https://github.com/linbit/linstor-proxmox
Regards, rck