From: "Fabian Grünbichler" <f.gruenbichler@proxmox.com>
To: Proxmox VE development discussion <pve-devel@lists.proxmox.com>
Subject: Re: [pve-devel] migrate local -> drbd fails with vanished job
Date: Fri, 26 Nov 2021 17:06:05 +0100
Message-ID: <1637941727.qelq78do1a.astroid@nora.none>
In-Reply-To: <20211126130357.GS1745@rck.sh>

On November 26, 2021 2:03 pm, Roland Kammerer wrote:
> Dear PVE devs,

I'll add some notes inline to point out other places where you can 
continue debugging.

> 
> While most of our users start with fresh VMs on DRBD storage, from time
> to time people try to migrate a local VM to DRBD storage. This currently fails.
> Migrating VMs from DRBD to DRBD works.
> 
> I added some debug code to PVE/QemuServer.pm, which looks like the location
> things go wrong, or at least where I saw them going wrong:
> 
> root@pve:/usr/share/perl5/PVE# diff -Nur QemuServer.pm{.orig,}
> --- QemuServer.pm.orig  2021-11-26 11:27:28.879989894 +0100
> +++ QemuServer.pm       2021-11-26 11:26:30.490988789 +0100
> @@ -7390,6 +7390,8 @@
>      $completion //= 'complete';
>      $op //= "mirror";
> 
> +    print "$vmid, $vmiddst, $jobs, $completion, $qga, $op \n";
> +    { use Data::Dumper; print Dumper($jobs); };
>      eval {
>         my $err_complete = 0;
> 
> @@ -7419,6 +7421,7 @@
>                     next;
>                 }
> 
> +               print "vanished: $vanished\n"; # same as !defined($jobs)
>                 die "$job_id: '$op' has been cancelled\n" if !defined($job);
> 
>                 my $busy = $job->{busy};
> 
> 
> With that in place, I try to live migrate the running VM from node "pve" to
> "pvf":
> 
> 2021-11-26 11:29:10 starting migration of VM 100 to node 'pvf' (xx.xx.xx.xx)
> 2021-11-26 11:29:10 found local disk 'local-lvm:vm-100-disk-0' (in current VM config)
> 2021-11-26 11:29:10 starting VM 100 on remote node 'pvf'
> 2021-11-26 11:29:18 volume 'local-lvm:vm-100-disk-0' is 'drbdstorage:vm-100-disk-1' on the target

so at this point the VM has been started on the remote node (with some 
special parameters for migration), and it told us which volume it 
allocated for the storage migration. for that volume an NBD 
server/export has been started; the target VM itself is suspended at 
this point and will only resume once all the data (disk and RAM/device 
state) has been transferred and the source VM has been suspended in turn.
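
if you want to double-check that side, here's a rough sketch of how you 
could query the target VM's state via QMP (run on the target node; this 
assumes the standard PVE perl modules and that mon_cmd is the right 
helper on your version, so treat it as a sketch rather than gospel):

    #!/usr/bin/perl
    # rough sketch - run on the *target* node while the migration is active
    use strict;
    use warnings;
    use Data::Dumper;
    use PVE::QemuServer::Monitor qw(mon_cmd);

    my $vmid = 100;
    # run state should be 'inmigrate' while the target waits for data
    print Dumper(mon_cmd($vmid, 'query-status'));
    # the freshly allocated volume should show up among the block devices
    print Dumper(mon_cmd($vmid, 'query-block'));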

> 2021-11-26 11:29:18 start remote tunnel
> 2021-11-26 11:29:19 ssh tunnel ver 1
> 2021-11-26 11:29:19 starting storage migration

connected over SSH, forwarding the sockets/ports for storage and 
RAM/state migration

> 2021-11-26 11:29:19 scsi0: start migration to nbd:unix:/run/qemu-server/100_nbd.migrate:exportname=drive-scsi0
> drive mirror is starting for drive-scsi0

drive-mirror started for the drive that changes storage
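
under the hood that is a QMP 'drive-mirror' command issued by 
qemu_drive_mirror - roughly the following (simplified sketch, values 
taken from your log above; the real call sets a few more options such 
as format and bandwidth limit):

    use PVE::QemuServer::Monitor qw(mon_cmd);

    # simplified sketch of the mirror setup for drive-scsi0
    mon_cmd(
        100,                  # source VMID
        'drive-mirror',
        device => 'drive-scsi0',
        mode   => 'existing', # the target (the NBD export) already exists
        sync   => 'full',
        target => 'nbd:unix:/run/qemu-server/100_nbd.migrate:exportname=drive-scsi0',
    );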

> Use of uninitialized value $qga in concatenation (.) or string at /usr/share/perl5/PVE/QemuServer.pm line 7393.

expected

> 100, 100, HASH(0x557b44474a80), skip, , mirror
> $VAR1 = {
>           'drive-scsi0' => {}
>         };

this looks okay (note: you could just pass @_ to Dumper to print all the 
args, that handles undef as well ;))
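
e.g. as a drop-in for the debug print above:

    { use Data::Dumper; print Dumper(@_); };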

> vanished: 1

but the job is not there when we ask qemu about it (query-block-jobs). 
this should normally return either a still-running job, or one that has 
converged; in the latter case we keep it in that state, proceed to the 
next disk (or to the VM state if it was the last one), and finally tell 
all the converged mirror jobs to switch over to the target node once the 
state transfer has finished (PVE::QemuMigrate).

so the question is whether the drive-mirror even starts successfully?
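
you can check that by hand (easiest with a sleep added, see further 
down) - a rough sketch of what the monitoring loop expects to see when 
queried on the source node:

    use strict;
    use warnings;
    use Data::Dumper;
    use PVE::QemuServer::Monitor qw(mon_cmd);

    # poll the block jobs like qemu_drive_mirror_monitor does
    my $vmid = 100;
    for (1 .. 10) {
        # an entry with device 'drive-scsi0' should be in here; once its
        # 'ready' flag is set the mirror has converged
        print Dumper(mon_cmd($vmid, 'query-block-jobs'));
        sleep 1;
    }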

> drive-scsi0: Cancelling block job
> drive-scsi0: Done.
> 2021-11-26 11:29:19 ERROR: online migrate failure - block job (mirror) error: drive-scsi0: 'mirror' has been cancelled

here the log order is a bit confusing - the error handler that cancels 
the block jobs runs in any case when an error occurs, and the original 
error ('has been cancelled', which here really means 'has vanished') is 
printed last. so just ignore the Cancelling part, we don't actively 
cancel in this case, because there is no job to cancel ;)

> 2021-11-26 11:29:19 aborting phase 2 - cleanup resources
> 2021-11-26 11:29:19 migrate_cancel
> 2021-11-26 11:29:22 ERROR: migration finished with problems (duration 00:00:12)
> TASK ERROR: migration problems
> 
> What I also see on "pvf" is that the plugin actually creates the DRBD block
> device, and "something" even tries to write data to it, as the DRBD devices
> auto-promotes to Primary.

that might just be the start codepath allocating the volume and then 
starting the VM suspended with an NBD server.
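
setting up the NBD export already opens the block device on the target, 
before any guest data is written - conceptually something like the 
following happens there (the exact QMP commands depend on the qemu/PVE 
version, so this is just a sketch), which would already explain the 
auto-promotion to Primary:

    use JSON;
    use PVE::QemuServer::Monitor qw(mon_cmd);

    my $vmid = 100;
    # start an NBD server on a unix socket and export the target drive
    mon_cmd($vmid, 'nbd-server-start',
        addr => { type => 'unix', data => { path => "/run/qemu-server/${vmid}_nbd.migrate" } });
    mon_cmd($vmid, 'nbd-server-add', device => 'drive-scsi0', writable => JSON::true);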
> 
> Any hints how I can debug that further? The block device should be ready at
> that point. What is going on in the background here?

are there any messages on either end in the journal? you can of course 
add a sleep at the start of qemu_drive_mirror_monitor, and interact with 
the VM via QMP yourself to see what state the disks and block jobs are in.
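
e.g. something like this as a purely temporary debugging aid:

    # right after the argument handling at the top of
    # qemu_drive_mirror_monitor: park the migration task so you can
    # inspect the block jobs via QMP (see the query-block-jobs sketch
    # above) from another shell
    sleep 600;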

in case there is nothing (or nothing meaningful) there, it's also 
possible to use QEMU's trace mechanism to get a feeling for how far it 
gets (you can set it up via the 'args' option in the VM config).

> 
> FWIW the plugin can be found here:
> https://github.com/linbit/linstor-proxmox
> 
> Regards, rck
> 