* [pve-devel] qemu live migration: bigger downtime recently
From: aderumier
Date: 2021-01-22 14:34 UTC
To: pve-devel

Hi,

I have noticed bigger downtime on qemu live migration recently.
(I'm not sure if it came with a qemu update or a qemu-server update.)

migration: type=insecure

qemu-server 6.3-2
pve-qemu-kvm 5.1.0-7

(I'm not sure which qemu version the machine is actually running.)

Here is a sample:

2021-01-22 15:28:38 starting migration of VM 226 to node 'kvm13' (10.3.94.70)
2021-01-22 15:28:42 starting VM 226 on remote node 'kvm13'
2021-01-22 15:28:44 start remote tunnel
2021-01-22 15:28:45 ssh tunnel ver 1
2021-01-22 15:28:45 starting online/live migration on tcp:10.3.94.70:60000
2021-01-22 15:28:45 set migration_caps
2021-01-22 15:28:45 migration speed limit: 8589934592 B/s
2021-01-22 15:28:45 migration downtime limit: 100 ms
2021-01-22 15:28:45 migration cachesize: 268435456 B
2021-01-22 15:28:45 set migration parameters
2021-01-22 15:28:45 start migrate command to tcp:10.3.94.70:60000
2021-01-22 15:28:47 migration speed: 1024.00 MB/s - downtime 2117 ms
2021-01-22 15:28:47 migration status: completed
2021-01-22 15:28:51 migration finished successfully (duration 00:00:13)
TASK OK

That's strange, because I don't see the memory transfer loop log lines.

Migrating back to the original host works:

2021-01-22 15:29:34 starting migration of VM 226 to node 'kvm2' (::ffff:10.3.94.50)
2021-01-22 15:29:36 starting VM 226 on remote node 'kvm2'
2021-01-22 15:29:39 start remote tunnel
2021-01-22 15:29:40 ssh tunnel ver 1
2021-01-22 15:29:40 starting online/live migration on tcp:[::ffff:10.3.94.50]:60000
2021-01-22 15:29:40 set migration_caps
2021-01-22 15:29:40 migration speed limit: 8589934592 B/s
2021-01-22 15:29:40 migration downtime limit: 100 ms
2021-01-22 15:29:40 migration cachesize: 268435456 B
2021-01-22 15:29:40 set migration parameters
2021-01-22 15:29:40 start migrate command to tcp:[::ffff:10.3.94.50]:60000
2021-01-22 15:29:41 migration status: active (transferred 396107554, remaining 1732018176), total 2165383168)
2021-01-22 15:29:41 migration xbzrle cachesize: 268435456 transferred 0 pages 0 cachemiss 0 overflow 0
2021-01-22 15:29:42 migration status: active (transferred 973010921, remaining 1089216512), total 2165383168)
2021-01-22 15:29:42 migration xbzrle cachesize: 268435456 transferred 0 pages 0 cachemiss 0 overflow 0
2021-01-22 15:29:43 migration status: active (transferred 1511925476, remaining 483463168), total 2165383168)
2021-01-22 15:29:43 migration xbzrle cachesize: 268435456 transferred 0 pages 0 cachemiss 0 overflow 0
2021-01-22 15:29:44 migration speed: 512.00 MB/s - downtime 148 ms
2021-01-22 15:29:44 migration status: completed
2021-01-22 15:29:47 migration finished successfully (duration 00:00:13)
TASK OK

Then migrating it again, like the first migration, works too:

2021-01-22 15:31:07 starting migration of VM 226 to node 'kvm13' (10.3.94.70)
2021-01-22 15:31:10 starting VM 226 on remote node 'kvm13'
2021-01-22 15:31:12 start remote tunnel
2021-01-22 15:31:13 ssh tunnel ver 1
2021-01-22 15:31:13 starting online/live migration on tcp:10.3.94.70:60000
2021-01-22 15:31:13 set migration_caps
2021-01-22 15:31:13 migration speed limit: 8589934592 B/s
2021-01-22 15:31:13 migration downtime limit: 100 ms
2021-01-22 15:31:13 migration cachesize: 268435456 B
2021-01-22 15:31:13 set migration parameters
2021-01-22 15:31:13 start migrate command to tcp:10.3.94.70:60000
2021-01-22 15:31:14 migration status: active (transferred 1092088188, remaining 944365568), total 2165383168)
2021-01-22 15:31:14 migration xbzrle cachesize: 268435456 transferred 0 pages 0 cachemiss 0 overflow 0
2021-01-22 15:31:15 migration speed: 1024.00 MB/s - downtime 55 ms
2021-01-22 15:31:15 migration status: completed
2021-01-22 15:31:19 migration finished successfully (duration 00:00:12)
TASK OK

Any idea? Maybe a bug in a specific qemu version?
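For reference, the "migration speed limit", "migration downtime limit" and "migration cachesize" lines in the task log map to standard QMP migration parameters (max-bandwidth, downtime-limit, xbzrle-cache-size). Below is a minimal sketch of setting them by hand over the VM's QMP socket; the socket path is an assumption and the values are simply copied from the log above, so this is not the actual qemu-server code:

#!/usr/bin/perl
use strict;
use warnings;
use IO::Socket::UNIX;
use JSON;

# Assumed QMP socket path for VM 226; adjust to your setup.
my $sock = IO::Socket::UNIX->new(Peer => '/var/run/qemu-server/226.qmp')
    or die "cannot connect to QMP socket: $!\n";

<$sock>;  # discard the QMP greeting
print $sock encode_json({ execute => 'qmp_capabilities' }) . "\n";
<$sock>;  # discard the capabilities reply

# Parameter values taken from the task log above.
print $sock encode_json({
    execute   => 'migrate-set-parameters',
    arguments => {
        'max-bandwidth'     => 8589934592,   # migration speed limit, B/s
        'downtime-limit'    => 100,          # migration downtime limit, ms
        'xbzrle-cache-size' => 268435456,    # migration cachesize, B
    },
}) . "\n";
print scalar <$sock>;  # expect {"return": {}} on success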
* Re: [pve-devel] qemu live migration: bigger downtime recently
From: aderumier
Date: 2021-01-22 15:06 UTC
To: pve-devel

I have tried to add a log line to display the current migration status, and it never catches any "active" state, only "completed" directly.

Here is another sample, with a bigger downtime of 14s (real downtime; I checked with a ping to be sure):

2021-01-22 16:02:53 starting migration of VM 391 to node 'kvm13' (10.3.94.70)
2021-01-22 16:02:53 starting VM 391 on remote node 'kvm13'
2021-01-22 16:02:55 start remote tunnel
2021-01-22 16:02:56 ssh tunnel ver 1
2021-01-22 16:02:56 starting online/live migration on tcp:10.3.94.70:60000
2021-01-22 16:02:56 set migration_caps
2021-01-22 16:02:56 migration speed limit: 8589934592 B/s
2021-01-22 16:02:56 migration downtime limit: 100 ms
2021-01-22 16:02:56 migration cachesize: 2147483648 B
2021-01-22 16:02:56 set migration parameters
2021-01-22 16:02:56 start migrate command to tcp:10.3.94.70:60000
2021-01-22 16:03:11 status: completed    ---> added log
2021-01-22 16:03:11 migration speed: 1092.27 MB/s - downtime 14424 ms
2021-01-22 16:03:11 migration status: completed
2021-01-22 16:03:14 migration finished successfully (duration 00:00:21)
TASK OK

The polling loop with the added log line:

    my $merr = $@;
    $self->log('info', "migrate uri => $ruri failed: $merr") if $merr;

    my $lstat = 0;
    my $usleep = 1000000;
    my $i = 0;
    my $err_count = 0;
    my $lastrem = undef;
    my $downtimecounter = 0;
    while (1) {
        $i++;
        my $avglstat = $lstat ? $lstat / $i : 0;

        usleep($usleep);
        my $stat;
        eval {
            $stat = mon_cmd($vmid, "query-migrate");
        };
        if (my $err = $@) {
            $err_count++;
            warn "query migrate failed: $err\n";
            $self->log('info', "query migrate failed: $err");
            if ($err_count <= 5) {
                usleep(1000000);
                next;
            }
            die "too many query migrate failures - aborting\n";
        }

        $self->log('info', "status: $stat->{status}");    ---> added log

On Friday 22 January 2021 at 15:34 +0100, aderumier@odiso.com wrote:
> Hi,
>
> I have noticed bigger downtime on qemu live migration recently.
> (I'm not sure if it came with a qemu update or a qemu-server update.)
> [...]
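If more than just $stat->{status} is needed for debugging, one option (a sketch on top of the loop above, not part of the original change; Data::Dumper is core Perl) is to dump the whole query-migrate reply right after the mon_cmd() call:

    use Data::Dumper;

    # right after: $stat = mon_cmd($vmid, "query-migrate");
    # log the full QMP reply, not only the status field
    $self->log('info', "raw query-migrate result: " . Dumper($stat));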
* Re: [pve-devel] qemu live migration: bigger downtime recently
From: aderumier
Date: 2021-01-22 18:55 UTC
To: pve-devel

After some debugging, it seems that it's hanging on

    $stat = mon_cmd($vmid, "query-migrate");

Result of "info migrate" after the end of a migration:

# info migrate
globals:
store-global-state: on
only-migratable: off
send-configuration: on
send-section-footer: on
decompress-error-check: on
clear-bitmap-shift: 18
Migration status: completed
total time: 9671 ms
downtime: 9595 ms
setup: 74 ms
transferred ram: 10445790 kbytes
throughput: 8916.93 mbps
remaining ram: 0 kbytes
total ram: 12600392 kbytes
duplicate: 544936 pages
skipped: 0 pages
normal: 2605162 pages
normal bytes: 10420648 kbytes
dirty sync count: 2
page size: 4 kbytes
multifd bytes: 0 kbytes
pages-per-second: 296540
cache size: 2147483648 bytes
xbzrle transferred: 0 kbytes
xbzrle pages: 0 pages
xbzrle cache miss: 0 pages
xbzrle cache miss rate: 0.00
xbzrle encoding rate: 0.00
xbzrle overflow: 0

On Friday 22 January 2021 at 16:06 +0100, aderumier@odiso.com wrote:
> I have tried to add a log line to display the current migration status,
> and it never catches any "active" state, only "completed" directly.
> [...]
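One way to confirm that the monitor call itself is what blocks is to time query-migrate directly against the VM's QMP socket while a migration is running. This is a standalone sketch, independent of qemu-server; the socket path is an assumption, and QMP events arriving on the same socket are not filtered out:

#!/usr/bin/perl
use strict;
use warnings;
use IO::Socket::UNIX;
use JSON;
use Time::HiRes qw(gettimeofday tv_interval sleep);

# Assumed QMP socket path of the migrating VM; adjust as needed.
my $sock = IO::Socket::UNIX->new(Peer => '/var/run/qemu-server/391.qmp')
    or die "cannot connect to QMP socket: $!\n";

<$sock>;  # discard the QMP greeting
print $sock encode_json({ execute => 'qmp_capabilities' }) . "\n";
<$sock>;  # discard the capabilities reply

while (1) {
    my $t0 = [gettimeofday];
    print $sock encode_json({ execute => 'query-migrate' }) . "\n";
    my $reply  = decode_json(scalar <$sock>);   # blocks if the monitor is stuck
    my $status = $reply->{return}{status} // 'unknown';
    printf "query-migrate took %.1fs, status: %s\n", tv_interval($t0), $status;
    last if $status eq 'completed' || $status eq 'failed';
    sleep(1);
}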
* Re: [pve-devel] qemu live migration: bigger downtime recently
From: aderumier
Date: 2021-01-23 8:38 UTC
To: pve-devel

About the qemu version: these VMs were started around 6 November, after an update of the qemu package on 4 November.

Looking at the Proxmox repo, I think it should be 5.1.0-4 or -5.

pve-qemu-kvm-dbg_5.1.0-4_amd64.deb    29-Oct-2020 17:28    75705544
pve-qemu-kvm-dbg_5.1.0-5_amd64.deb    04-Nov-2020 17:41    75737556
pve-qemu-kvm-dbg_5.1.0-6_amd64.deb    05-Nov-2020 18:08    75693264

Could it be a known bug introduced by the new backup dirty-bitmap patches, and fixed later? (I see a -6 version one day later.)

On Friday 22 January 2021 at 19:55 +0100, aderumier@odiso.com wrote:
> After some debugging, it seems that it's hanging on
>
>     $stat = mon_cmd($vmid, "query-migrate");
> [...]
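A rough check for whether a VM is still running a QEMU binary that has since been replaced by a package upgrade (i.e. it was started before the last pve-qemu-kvm update) is to look at the exe symlink of its kvm process. This is a standalone sketch; the pidfile path and VMID are assumptions:

#!/usr/bin/perl
use strict;
use warnings;

# Assumed qemu-server pidfile location; adjust the VMID/path as needed.
my $vmid    = 226;
my $pidfile = "/var/run/qemu-server/$vmid.pid";

open(my $fh, '<', $pidfile) or die "cannot open $pidfile: $!\n";
chomp(my $pid = <$fh>);
close($fh);

# If the QEMU binary was replaced on disk after the VM started, the exe
# symlink of the running process ends in " (deleted)".
my $exe = readlink("/proc/$pid/exe")
    or die "cannot read /proc/$pid/exe: $!\n";

if ($exe =~ /\(deleted\)$/) {
    print "VM $vmid still runs a QEMU binary that was since replaced: $exe\n";
} else {
    print "VM $vmid runs the currently installed QEMU binary: $exe\n";
}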
* Re: [pve-devel] qemu live migration: bigger downtime recently
From: Fabian Grünbichler
Date: 2021-01-25 8:47 UTC
To: pve-devel

On January 23, 2021 9:38 am, aderumier@odiso.com wrote:
> About the qemu version: these VMs were started around 6 November, after an
> update of the qemu package on 4 November.
>
> Looking at the Proxmox repo, I think it should be 5.1.0-4 or -5.
>
> pve-qemu-kvm-dbg_5.1.0-4_amd64.deb    29-Oct-2020 17:28    75705544
> pve-qemu-kvm-dbg_5.1.0-5_amd64.deb    04-Nov-2020 17:41    75737556
> pve-qemu-kvm-dbg_5.1.0-6_amd64.deb    05-Nov-2020 18:08    75693264
>
> Could it be a known bug introduced by the new backup dirty-bitmap patches,
> and fixed later? (I see a -6 version one day later.)

pve-qemu-kvm (5.1.0-6) pve; urgency=medium

  * migration/block-dirty-bitmap: avoid telling QEMU that the bitmap migration
    is active longer than required

 -- Proxmox Support Team <support@proxmox.com>  Thu, 05 Nov 2020 18:59:40 +0100

Sounds like that could be the case? ;)
* Re: [pve-devel] qemu live migration: bigger downtime recently
From: aderumier
Date: 2021-01-25 9:26 UTC
To: Proxmox VE development discussion

> pve-qemu-kvm (5.1.0-6) pve; urgency=medium
>
>   * migration/block-dirty-bitmap: avoid telling QEMU that the bitmap migration
>     is active longer than required
>
>  -- Proxmox Support Team <support@proxmox.com>  Thu, 05 Nov 2020 18:59:40 +0100
>
> Sounds like that could be the case? ;)

Yes, I was not sure about this. So I was just out of luck when I upgraded ^_^

I have tried to set dirty-bitmaps=0 in set_migration_caps, but it doesn't fix it.

So I think I'm good to plan some migrations with downtime.

Thanks for your response!

Alexandre
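For reference, disabling that capability by hand corresponds to a QMP migrate-set-capabilities call on the source VM, roughly like the sketch below (the socket path is an assumption); as noted above, this did not help in this case:

#!/usr/bin/perl
use strict;
use warnings;
use IO::Socket::UNIX;
use JSON;

# Assumed QMP socket path of the VM to be migrated; adjust as needed.
my $sock = IO::Socket::UNIX->new(Peer => '/var/run/qemu-server/391.qmp')
    or die "cannot connect to QMP socket: $!\n";

<$sock>;  # discard the QMP greeting
print $sock encode_json({ execute => 'qmp_capabilities' }) . "\n";
<$sock>;  # discard the capabilities reply

# Turn off the dirty-bitmaps migration capability before migrating.
print $sock encode_json({
    execute   => 'migrate-set-capabilities',
    arguments => {
        capabilities => [ { capability => 'dirty-bitmaps', state => JSON::false } ],
    },
}) . "\n";
print scalar <$sock>;  # expect {"return": {}} on success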
Thread overview: 6 messages (newest: 2021-01-25 9:26 UTC)
  2021-01-22 14:34  [pve-devel] qemu live migration: bigger downtime recently  aderumier
  2021-01-22 15:06  ` aderumier
  2021-01-22 18:55  ` aderumier
  2021-01-23  8:38  ` aderumier
  2021-01-25  8:47  ` Fabian Grünbichler
  2021-01-25  9:26  ` aderumier