public inbox for pve-devel@lists.proxmox.com
From: aderumier@odiso.com
To: pve-devel <pve-devel@pve.proxmox.com>
Subject: Re: [pve-devel] qemu live migration: bigger downtime recently
Date: Fri, 22 Jan 2021 16:06:26 +0100
Message-ID: <ca2fa17ffd670a60e72071ff401139c104d1e854.camel@odiso.com>
In-Reply-To: <b08a69e25cdfe7615bd192ab07b169a0100fadeb.camel@odiso.com>

I added a log line to display the current migration status, and it
never catches an "active" state; it reports "completed" directly.

Here is another sample with a larger downtime of 14 s (real downtime; I
checked with a ping to be sure):



2021-01-22 16:02:53 starting migration of VM 391 to node 'kvm13' (10.3.94.70)
2021-01-22 16:02:53 starting VM 391 on remote node 'kvm13'
2021-01-22 16:02:55 start remote tunnel
2021-01-22 16:02:56 ssh tunnel ver 1
2021-01-22 16:02:56 starting online/live migration on tcp:10.3.94.70:60000
2021-01-22 16:02:56 set migration_caps
2021-01-22 16:02:56 migration speed limit: 8589934592 B/s
2021-01-22 16:02:56 migration downtime limit: 100 ms
2021-01-22 16:02:56 migration cachesize: 2147483648 B
2021-01-22 16:02:56 set migration parameters
2021-01-22 16:02:56 start migrate command to tcp:10.3.94.70:60000



2021-01-22 16:03:11 status: completed ---> added log
2021-01-22 16:03:11 migration speed: 1092.27 MB/s - downtime 14424 ms
2021-01-22 16:03:11 migration status: completed
2021-01-22 16:03:14 migration finished successfully (duration 00:00:21)
TASK OK
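
The pause can also be read directly from the timestamps. The following is only a rough sketch (the helper name largest_gap is made up for illustration and is not part of qemu-server): it scans a few lines copied from the task log above and reports the longest pause between two consecutive timestamped lines.

```perl
use strict;
use warnings;
use Time::Piece;

# Return ($gap_seconds, $line_before, $line_after) for the largest pause
# between two consecutive timestamped lines of a task log.
# (Hypothetical helper, for illustration only.)
sub largest_gap {
    my (@lines) = @_;
    my ($prev_t, $prev_line);
    my @best = (0, undef, undef);
    for my $line (@lines) {
        # Wrapped continuation lines carry no timestamp and are skipped.
        next unless $line =~ /^(\d{4}-\d\d-\d\d \d\d:\d\d:\d\d) /;
        my $t = Time::Piece->strptime($1, '%Y-%m-%d %H:%M:%S')->epoch;
        if (defined $prev_t && ($t - $prev_t) > $best[0]) {
            @best = ($t - $prev_t, $prev_line, $line);
        }
        ($prev_t, $prev_line) = ($t, $line);
    }
    return @best;
}

# A few lines copied from the task log above.
my @log = (
    "2021-01-22 16:02:56 set migration parameters",
    "2021-01-22 16:02:56 start migrate command to tcp:10.3.94.70:60000",
    "2021-01-22 16:03:11 status: completed",
    "2021-01-22 16:03:11 migration speed: 1092.27 MB/s - downtime 14424 ms",
    "2021-01-22 16:03:14 migration finished successfully (duration 00:00:21)",
);

my ($gap, $before, $after) = largest_gap(@log);
printf "largest gap: %d s\n  before: %s\n  after:  %s\n", $gap, $before, $after;
```

For these lines the largest gap is the 15 s between "start migrate command" and the first logged status, which roughly matches the reported downtime of 14424 ms.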



    my $merr = $@;
    $self->log('info', "migrate uri => $ruri failed: $merr") if $merr;

    my $lstat = 0;
    my $usleep = 1000000;
    my $i = 0;
    my $err_count = 0;
    my $lastrem = undef;
    my $downtimecounter = 0;
    while (1) {
        $i++;
        my $avglstat = $lstat ? $lstat / $i : 0;

        usleep($usleep);
        my $stat;
        eval {
            $stat = mon_cmd($vmid, "query-migrate");
        };
        if (my $err = $@) {
            $err_count++;
            warn "query migrate failed: $err\n";
            $self->log('info', "query migrate failed: $err");
            if ($err_count <= 5) {
                usleep(1000000);
                next;
            }
            die "too many query migrate failures - aborting\n";
        }

        $self->log('info', "status: $stat->{status}");   # added log
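

One possible way to narrow this down further is to time each query and record only the status changes, so it becomes visible whether the time is spent inside the query call itself or elsewhere. This is a minimal sketch under assumptions, not the qemu-server code: trace_migration_status is a hypothetical helper, and the $query coderef stands in for mon_cmd($vmid, "query-migrate"); here it is stubbed with a fixed replay so the example runs without a VM.

```perl
use strict;
use warnings;
use Time::HiRes qw(usleep gettimeofday tv_interval);

# Poll $query (a coderef standing in for mon_cmd($vmid, "query-migrate"))
# until it reports a terminal status or $max_polls is reached.
# Returns one [status, seconds_the_query_call_took] entry per status change.
# (Hypothetical helper, for illustration only.)
sub trace_migration_status {
    my ($query, $interval_us, $max_polls) = @_;
    my @transitions;
    my $last = '';
    for (1 .. $max_polls) {
        my $t0   = [gettimeofday];
        my $stat = $query->();
        my $took = tv_interval($t0);    # how long the query itself blocked
        my $status = $stat->{status} // 'unknown';
        if ($status ne $last) {
            push @transitions, [$status, $took];
            $last = $status;
        }
        last if $status =~ /^(?:completed|failed|cancelled)$/;
        usleep($interval_us);
    }
    return @transitions;
}

# Stub monitor: replays a fixed sequence of statuses instead of talking to QEMU.
my @replay = map { +{ status => $_ } } qw(setup active active completed);
my @seen = trace_migration_status(sub { shift @replay }, 1000, 10);
printf "%-10s query took %.3f s\n", @$_ for @seen;
```

With a real monitor call in place of the stub and a poll interval shorter than one second, a long "query took" value on the first entry would point at the query being blocked, whereas short values would suggest the delay happens before the loop starts.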


On Friday, 22 January 2021 at 15:34 +0100, aderumier@odiso.com
wrote:
> Hi,
> 
> I have recently noticed larger downtimes on qemu live migration.
> (I'm not sure whether it started after the qemu update or the
> qemu-server update.)
> 
> migration: type=insecure
> 
>  qemu-server                          6.3-2  
>  pve-qemu-kvm                         5.1.0-7   
> 
> (I'm not sure which qemu version the running machine is using.)
> 
> 
> 
> Here is a sample:
> 
> 
> 
> 2021-01-22 15:28:38 starting migration of VM 226 to node 'kvm13'
> (10.3.94.70)
> 2021-01-22 15:28:42 starting VM 226 on remote node 'kvm13'
> 2021-01-22 15:28:44 start remote tunnel
> 2021-01-22 15:28:45 ssh tunnel ver 1
> 2021-01-22 15:28:45 starting online/live migration on
> tcp:10.3.94.70:60000
> 2021-01-22 15:28:45 set migration_caps
> 2021-01-22 15:28:45 migration speed limit: 8589934592 B/s
> 2021-01-22 15:28:45 migration downtime limit: 100 ms
> 2021-01-22 15:28:45 migration cachesize: 268435456 B
> 2021-01-22 15:28:45 set migration parameters
> 2021-01-22 15:28:45 start migrate command to tcp:10.3.94.70:60000
> 2021-01-22 15:28:47 migration speed: 1024.00 MB/s - downtime 2117 ms
> 2021-01-22 15:28:47 migration status: completed
> 2021-01-22 15:28:51 migration finished successfully (duration
> 00:00:13)
> TASK OK
> 
> That's strange, because I don't see the memory transfer loop logs.
> 
> 
> 
> Migrating back to the original host works:
> 
> 2021-01-22 15:29:34 starting migration of VM 226 to node 'kvm2'
> (::ffff:10.3.94.50)
> 2021-01-22 15:29:36 starting VM 226 on remote node 'kvm2'
> 2021-01-22 15:29:39 start remote tunnel
> 2021-01-22 15:29:40 ssh tunnel ver 1
> 2021-01-22 15:29:40 starting online/live migration on
> tcp:[::ffff:10.3.94.50]:60000
> 2021-01-22 15:29:40 set migration_caps
> 2021-01-22 15:29:40 migration speed limit: 8589934592 B/s
> 2021-01-22 15:29:40 migration downtime limit: 100 ms
> 2021-01-22 15:29:40 migration cachesize: 268435456 B
> 2021-01-22 15:29:40 set migration parameters
> 2021-01-22 15:29:40 start migrate command to
> tcp:[::ffff:10.3.94.50]:60000
> 2021-01-22 15:29:41 migration status: active (transferred 396107554,
> remaining 1732018176), total 2165383168)
> 2021-01-22 15:29:41 migration xbzrle cachesize: 268435456 transferred
> 0
> pages 0 cachemiss 0 overflow 0
> 2021-01-22 15:29:42 migration status: active (transferred 973010921,
> remaining 1089216512), total 2165383168)
> 2021-01-22 15:29:42 migration xbzrle cachesize: 268435456 transferred
> 0
> pages 0 cachemiss 0 overflow 0
> 2021-01-22 15:29:43 migration status: active (transferred 1511925476,
> remaining 483463168), total 2165383168)
> 2021-01-22 15:29:43 migration xbzrle cachesize: 268435456 transferred
> 0
> pages 0 cachemiss 0 overflow 0
> 2021-01-22 15:29:44 migration speed: 512.00 MB/s - downtime 148 ms
> 2021-01-22 15:29:44 migration status: completed
> 2021-01-22 15:29:47 migration finished successfully (duration
> 00:00:13)
> TASK OK
> 
> 
> Then migrating it again, in the same direction as the first
> migration, works too:
> 
> 
> 2021-01-22 15:31:07 starting migration of VM 226 to node 'kvm13'
> (10.3.94.70)
> 2021-01-22 15:31:10 starting VM 226 on remote node 'kvm13'
> 2021-01-22 15:31:12 start remote tunnel
> 2021-01-22 15:31:13 ssh tunnel ver 1
> 2021-01-22 15:31:13 starting online/live migration on
> tcp:10.3.94.70:60000
> 2021-01-22 15:31:13 set migration_caps
> 2021-01-22 15:31:13 migration speed limit: 8589934592 B/s
> 2021-01-22 15:31:13 migration downtime limit: 100 ms
> 2021-01-22 15:31:13 migration cachesize: 268435456 B
> 2021-01-22 15:31:13 set migration parameters
> 2021-01-22 15:31:13 start migrate command to tcp:10.3.94.70:60000
> 2021-01-22 15:31:14 migration status: active (transferred 1092088188,
> remaining 944365568), total 2165383168)
> 2021-01-22 15:31:14 migration xbzrle cachesize: 268435456 transferred
> 0
> pages 0 cachemiss 0 overflow 0
> 2021-01-22 15:31:15 migration speed: 1024.00 MB/s - downtime 55 ms
> 2021-01-22 15:31:15 migration status: completed
> 2021-01-22 15:31:19 migration finished successfully (duration
> 00:00:12)
> TASK OK
> 
> 
> Any ideas? Maybe a bug in a specific qemu version?
> 
> 
> 
> 




