From: "Fabian Grünbichler" <f.gruenbichler@proxmox.com>
To: Proxmox VE development discussion <pve-devel@lists.proxmox.com>
Subject: Re: [pve-devel] [PATCH v2 qemu-server 2/2] remote-migration: add target-cpu param
Date: Thu, 27 Apr 2023 09:32:42 +0200 [thread overview]
Message-ID: <1682580098.xwye6zkp88.astroid@yuna.none> (raw)
In-Reply-To: <a278d421ea7e917e42f89a4de42ca4fb39913bce.camel@groupe-cyllene.com>
On April 27, 2023 7:50 am, DERUMIER, Alexandre wrote:
> Hi,
>
> Le mercredi 26 avril 2023 à 15:14 +0200, Fabian Grünbichler a écrit :
>> On April 25, 2023 6:52 pm, Alexandre Derumier wrote:
>> > This patch add support for remote migration when target
>> > cpu model is different.
>> >
>> > The target vm is restart after the migration
>>
>> so this effectively introduces a new "hybrid" migration mode ;) the
>> changes are a bit smaller than I expected (in part thanks to patch
>> #1),
>> which is good.
>>
>> there are semi-frequent requests for another variant (also applicable
>> to
>> containers) in the form of a two phase migration
>> - storage migrate
>> - stop guest
>> - incremental storage migrate
>> - start guest on target
>>
>
> But I'm not sure how to to an incremental storage migrate, without
> storage snapshot send|receiv. (so zfs && rbd could work).
>
> - Vm/ct is running
> - do a first snapshot + sync to target with zfs|rbd send|receive
> - stop the guest
> - do a second snapshot + incremental sync + sync to target with zfs|rbd
> send|receive
> - start the guest on remote
>
>
> (or maybe for vm, without snapshot, with a dirty bitmap ? But we need
> to be able to write the dirty map content to disk somewhere after vm
> stop, and reread it for the last increment )
theoretically, we could support such a mode for non-snapshot storages by
using bitmaps+block-mirror, yes. either with a target VM, or with
qemu-storage-daemon on the target node exposing the target volumes
> - vm is running
> - create a dirty-bitmap and start sync with qemu-block-storage
> - stop the vm && save the dirty bitmap
> - reread the dirtymap && do incremental sync (with the new qemu-daemon-
> storage or starting the vm paused ?
stop here could also just mean stop the guest OS, but leave the process
for the incremental sync, so it would not need persistent bitmap
support.
> And currently we don't support yet offline storage migration. (BTW,
> This is also breaking migration with unused disk).
> I don't known if we can send send|receiv transfert through the tunnel ?
> (I never tested it)
we do, but maybe you tested with RBD which doesn't support storage
migration yet? withing a cluster it doesn't need to, since it's a shared
storage, but between cluster we need to implement it (it's on my TODO
list and shouldn't be too hard since there is 'rbd export/import').
>> given that it might make sense to save-guard this implementation
>> here,
>> and maybe switch to a new "mode" parameter?
>>
>> online => switching CPU not allowed
>> offline or however-we-call-this-new-mode (or in the future, two-
>> phase-restart) => switching CPU allowed
>>
>
> Yes, I was thinking about that too.
> Maybe not "offline", because maybe we want to implement a real offline
> mode later.
> But simply "restart" ?
no, I meant moving the existing --online switch to a new mode parameter,
then we'd have "online" and "offline", and then add your new mode on top
"however-we-call-this-new-mode", and then we could in the future also
add "two-phase-restart" for the sync-twice mode I described :)
target-cpu would of course also be supported for the (existing) offline
mode, since it just needs to adapt the target-cpu in the config.
the main thing I'd want to avoid is somebody accidentally setting
"target-cpu", not knowing/noticing that that entails what amounts to a
reset of the VM as part of the migration..
there were a few things down below that might also be worthy of
discussion. I also wonder whether the two variants of "freeze FS" and
"suspend without state" are enough - that only ensures that no more I/O
happens so the volumes are bitwise identical, but shouldn't we also at
least have the option of doing a clean shutdown at that point so that
applications can serialize/flush their state properly and that gets
synced across as well? else this is the equivalent of cutting the power
cord, which might not be a good fit for all use cases ;)
>> >
>> > Signed-off-by: Alexandre Derumier <aderumier@odiso.com>
>> > ---
>> > PVE/API2/Qemu.pm | 18 ++++++++++++++++++
>> > PVE/CLI/qm.pm | 6 ++++++
>> > PVE/QemuMigrate.pm | 25 +++++++++++++++++++++++++
>> > 3 files changed, 49 insertions(+)
>> >
>> > diff --git a/PVE/API2/Qemu.pm b/PVE/API2/Qemu.pm
>> > index 587bb22..6703c87 100644
>> > --- a/PVE/API2/Qemu.pm
>> > +++ b/PVE/API2/Qemu.pm
>> > @@ -4460,6 +4460,12 @@ __PACKAGE__->register_method({
>> > optional => 1,
>> > default => 0,
>> > },
>> > + 'target-cpu' => {
>> > + optional => 1,
>> > + description => "Target Emulated CPU model. For
>> > online migration, the storage is live migrate, but the memory
>> > migration is skipped and the target vm is restarted.",
>> > + type => 'string',
>> > + format => 'pve-vm-cpu-conf',
>> > + },
>> > 'target-storage' => get_standard_option('pve-
>> > targetstorage', {
>> > completion =>
>> > \&PVE::QemuServer::complete_migration_storage,
>> > optional => 0,
>> > @@ -4557,11 +4563,14 @@ __PACKAGE__->register_method({
>> > raise_param_exc({ 'target-bridge' => "failed to parse
>> > bridge map: $@" })
>> > if $@;
>> >
>> > + my $target_cpu = extract_param($param, 'target-cpu');
>>
>> this is okay
>>
>> > +
>> > die "remote migration requires explicit storage mapping!\n"
>> > if $storagemap->{identity};
>> >
>> > $param->{storagemap} = $storagemap;
>> > $param->{bridgemap} = $bridgemap;
>> > + $param->{targetcpu} = $target_cpu;
>>
>> but this is a bit confusing with the variable/hash key naming ;)
>>
>> > $param->{remote} = {
>> > conn => $conn_args, # re-use fingerprint for tunnel
>> > client => $api_client,
>> > @@ -5604,6 +5613,15 @@ __PACKAGE__->register_method({
>> > PVE::QemuServer::nbd_stop($state->{vmid});
>> > return;
>> > },
>> > + 'restart' => sub {
>> > + PVE::QemuServer::vm_stop(undef, $state->{vmid},
>> > 1, 1);
>> > + my $info = PVE::QemuServer::vm_start_nolock(
>> > + $state->{storecfg},
>> > + $state->{vmid},
>> > + $state->{conf},
>> > + );
>> > + return;
>> > + },
>> > 'resume' => sub {
>> > if
>> > (PVE::QemuServer::Helpers::vm_running_locally($state->{vmid})) {
>> > PVE::QemuServer::vm_resume($state->{vmid},
>> > 1, 1);
>> > diff --git a/PVE/CLI/qm.pm b/PVE/CLI/qm.pm
>> > index c3c2982..06c74c1 100755
>> > --- a/PVE/CLI/qm.pm
>> > +++ b/PVE/CLI/qm.pm
>> > @@ -189,6 +189,12 @@ __PACKAGE__->register_method({
>> > optional => 1,
>> > default => 0,
>> > },
>> > + 'target-cpu' => {
>> > + optional => 1,
>> > + description => "Target Emulated CPU model. For
>> > online migration, the storage is live migrate, but the memory
>> > migration is skipped and the target vm is restarted.",
>> > + type => 'string',
>> > + format => 'pve-vm-cpu-conf',
>> > + },
>> > 'target-storage' => get_standard_option('pve-
>> > targetstorage', {
>> > completion =>
>> > \&PVE::QemuServer::complete_migration_storage,
>> > optional => 0,
>> > diff --git a/PVE/QemuMigrate.pm b/PVE/QemuMigrate.pm
>> > index e182415..04f8053 100644
>> > --- a/PVE/QemuMigrate.pm
>> > +++ b/PVE/QemuMigrate.pm
>> > @@ -731,6 +731,11 @@ sub cleanup_bitmaps {
>> > sub live_migration {
>> > my ($self, $vmid, $migrate_uri, $spice_port) = @_;
>> >
>> > + if($self->{opts}->{targetcpu}){
>> > + $self->log('info', "target cpu is different - skip live
>> > migration.");
>> > + return;
>> > + }
>> > +
>> > my $conf = $self->{vmconf};
>> >
>> > $self->log('info', "starting online/live migration on
>> > $migrate_uri");
>> > @@ -995,6 +1000,7 @@ sub phase1_remote {
>> > my $remote_conf = PVE::QemuConfig->load_config($vmid);
>> > PVE::QemuConfig->update_volume_ids($remote_conf, $self-
>> > >{volume_map});
>> >
>> > + $remote_conf->{cpu} = $self->{opts}->{targetcpu};
>>
>> do we need permission checks here (or better, somewhere early on, for
>> doing this here)
>>
>> > my $bridges = map_bridges($remote_conf, $self->{opts}-
>> > >{bridgemap});
>> > for my $target (keys $bridges->%*) {
>> > for my $nic (keys $bridges->{$target}->%*) {
>> > @@ -1354,6 +1360,21 @@ sub phase2 {
>> > live_migration($self, $vmid, $migrate_uri, $spice_port);
>> >
>> > if ($self->{storage_migration}) {
>> > +
>> > + #freeze source vm io/s if target cpu is different (no
>> > livemigration)
>> > + if ($self->{opts}->{targetcpu}) {
>> > + my $agent_running = $self->{conf}->{agent} &&
>> > PVE::QemuServer::qga_check_running($vmid);
>> > + if ($agent_running) {
>> > + print "freeze filesystem\n";
>> > + eval { mon_cmd($vmid, "guest-fsfreeze-freeze"); };
>> > + die $@ if $@;
>>
>> die here
>>
>> > + } else {
>> > + print "suspend vm\n";
>> > + eval { PVE::QemuServer::vm_suspend($vmid, 1); };
>> > + warn $@ if $@;
>>
>> but warn here?
>>
>> I'd like some more rationale for these two variants, what are the
>> pros
>> and cons? should we make it configurable?
>> > + }
>> > + }
>> > +
>> > # finish block-job with block-job-cancel, to disconnect
>> > source VM from NBD
>> > # to avoid it trying to re-establish it. We are in blockjob
>> > ready state,
>> > # thus, this command changes to it to blockjob complete
>> > (see qapi docs)
>> > @@ -1608,6 +1629,10 @@ sub phase3_cleanup {
>> > # clear migrate lock
>> > if ($tunnel && $tunnel->{version} >= 2) {
>> > PVE::Tunnel::write_tunnel($tunnel, 10, "unlock");
>> > + if ($self->{opts}->{targetcpu}) {
>> > + $self->log('info', "target cpu is different - restart
>> > target vm.");
>> > + PVE::Tunnel::write_tunnel($tunnel, 10, 'restart');
>> > + }
>> >
>> > PVE::Tunnel::finish_tunnel($tunnel);
>> > } else {
>> > --
>> > 2.30.2
>> >
>> >
>> > _______________________________________________
>> > pve-devel mailing list
>> > pve-devel@lists.proxmox.com
>> > https://antiphishing.cetsi.fr/proxy/v3?i=Zk92VEFKaGQ4Ums4cnZEUWMTpfHaXFQGRw1_CnOoOH0&r=bHA1dGV3NWJQVUloaWNFUZPm0fiiBviaiy_RDav2GQ1U4uy6lsDDv3uBszpvvWYQN5FqKqFD6WPYupfAUP1c9g&f=SlhDbE9uS2laS2JaZFpNWvmsxai1zlJP9llgnl5HIv-4jAji8Dh2BQawzxID5bzr6Uv-3EQd-eluQbsPfcUOTg&u=https%3A//lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel&k=XRKU
>> >
>> >
>> >
>>
>>
>> _______________________________________________
>> pve-devel mailing list
>> pve-devel@lists.proxmox.com
>> https://antiphishing.cetsi.fr/proxy/v3?i=Zk92VEFKaGQ4Ums4cnZEUWMTpfHaXFQGRw1_CnOoOH0&r=bHA1dGV3NWJQVUloaWNFUZPm0fiiBviaiy_RDav2GQ1U4uy6lsDDv3uBszpvvWYQN5FqKqFD6WPYupfAUP1c9g&f=SlhDbE9uS2laS2JaZFpNWvmsxai1zlJP9llgnl5HIv-4jAji8Dh2BQawzxID5bzr6Uv-3EQd-eluQbsPfcUOTg&u=https%3A//lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel&k=XRKU
>>
>
> _______________________________________________
> pve-devel mailing list
> pve-devel@lists.proxmox.com
> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
>
next prev parent reply other threads:[~2023-04-27 7:33 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-04-25 16:52 [pve-devel] [PATCH v2 qemu-server 0/2] remote-migration: migration with different cpu Alexandre Derumier
2023-04-25 16:52 ` [pve-devel] [PATCH v2 qemu-server 1/2] migration: move livemigration code in a dedicated sub Alexandre Derumier
2023-04-25 16:52 ` [pve-devel] [PATCH v2 qemu-server 2/2] remote-migration: add target-cpu param Alexandre Derumier
2023-04-26 13:14 ` Fabian Grünbichler
2023-04-27 5:50 ` DERUMIER, Alexandre
2023-04-27 7:32 ` Fabian Grünbichler [this message]
2023-04-28 6:43 ` DERUMIER, Alexandre
2023-04-28 9:12 ` Fabian Grünbichler
2023-04-29 7:57 ` Thomas Lamprecht
2023-05-02 8:30 ` Fabian Grünbichler
2023-09-28 14:58 ` DERUMIER, Alexandre
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1682580098.xwye6zkp88.astroid@yuna.none \
--to=f.gruenbichler@proxmox.com \
--cc=pve-devel@lists.proxmox.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.