From mboxrd@z Thu Jan 1 00:00:00 1970
From: Stefan Reiter
To: pve-devel@lists.proxmox.com, pbs-devel@lists.proxmox.com
Date: Wed, 3 Mar 2021 10:56:09 +0100
Message-Id: <20210303095612.7475-9-s.reiter@proxmox.com>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20210303095612.7475-1-s.reiter@proxmox.com>
References: <20210303095612.7475-1-s.reiter@proxmox.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Subject: [pbs-devel] [PATCH v2 qemu-server 08/11] enable live-restore for PBS

Enables live-restore functionality using the 'alloc-track' QEMU driver.
This allows starting a VM immediately when restoring from a PBS
snapshot. The snapshot is mounted into the VM, so it can boot from that,
while guest reads and a 'block-stream' job handle the restore in the
background.

If an error occurs, the VM is deleted and all data written during the
restore is lost.

The VM remains locked during the restore, which automatically prohibits
any modifications to the config while restoring. Some modifications
might potentially be safe, however, this is experimental enough that I
believe this would cause more bad stuff(tm) than actually satisfy any
use cases.

Pool handling is slightly adjusted so the VM can be added to the pool
before the restore starts.

Signed-off-by: Stefan Reiter
---

Looks better with -w

v2:
* move pool processing of vma backups to QemuServer.pm too, to keep it in one
  file at least

 PVE/API2/Qemu.pm  |  14 ++++-
 PVE/QemuServer.pm | 150 ++++++++++++++++++++++++++++++++++++++--------
 2 files changed, 138 insertions(+), 26 deletions(-)

diff --git a/PVE/API2/Qemu.pm b/PVE/API2/Qemu.pm
index feb9ea8..28607f4 100644
--- a/PVE/API2/Qemu.pm
+++ b/PVE/API2/Qemu.pm
@@ -490,6 +490,12 @@ __PACKAGE__->register_method({
                 description => "Assign a unique random ethernet address.",
                 requires => 'archive',
             },
+            'live-restore' => {
+                optional => 1,
+                type => 'boolean',
+                description => "Start the VM immediately from the backup and restore in background. PBS only.",
+                requires => 'archive',
+            },
             pool => {
                 optional => 1,
                 type => 'string', format => 'pve-poolid',
@@ -531,6 +537,10 @@ __PACKAGE__->register_method({
         my $start_after_create = extract_param($param, 'start');
         my $storage = extract_param($param, 'storage');
         my $unique = extract_param($param, 'unique');
+        my $live_restore = extract_param($param, 'live-restore');
+
+        raise_param_exc({ 'start' => "cannot specify 'start' with 'live-restore'" })
+            if $start_after_create && $live_restore;
 
         if (defined(my $ssh_keys = $param->{sshkeys})) {
                 $ssh_keys = URI::Escape::uri_unescape($ssh_keys);
@@ -613,8 +623,10 @@ __PACKAGE__->register_method({
                     pool => $pool,
                     unique => $unique,
                     bwlimit => $bwlimit,
+                    live => $live_restore,
                 };
                 if ($archive->{type} eq 'file' || $archive->{type} eq 'pipe') {
+                    die "live-restore is only compatible with PBS\n" if $live_restore;
                     PVE::QemuServer::restore_file_archive($archive->{path} // '-', $vmid, $authuser, $restore_options);
                 } elsif ($archive->{type} eq 'pbs') {
                     PVE::QemuServer::restore_proxmox_backup_archive($archive->{volid}, $vmid, $authuser, $restore_options);
@@ -628,8 +640,6 @@ __PACKAGE__->register_method({
                     eval { PVE::QemuServer::template_create($vmid, $restored_conf) };
                     warn $@ if $@;
                 }
-
-                PVE::AccessControl::add_vm_to_pool($vmid, $pool) if $pool;
             };
 
             # ensure no old replication state are exists
diff --git a/PVE/QemuServer.pm b/PVE/QemuServer.pm
index 22edc2a..d4ee8ec 100644
--- a/PVE/QemuServer.pm
+++ b/PVE/QemuServer.pm
@@ -6128,7 +6128,7 @@ sub restore_proxmox_backup_archive {
 
     my $repo = PVE::PBSClient::get_repository($scfg);
 
-    # This is only used for `pbs-restore`!
+    # This is only used for `pbs-restore` and the QEMU PBS driver (live-restore)
     my $password = PVE::Storage::PBSPlugin::pbs_get_password($scfg, $storeid);
     local $ENV{PBS_PASSWORD} = $password;
     local $ENV{PBS_FINGERPRINT} = $fingerprint if defined($fingerprint);
@@ -6225,34 +6225,35 @@ sub restore_proxmox_backup_archive {
         # allocate volumes
         my $map = $restore_allocate_devices->($storecfg, $virtdev_hash, $vmid);
 
-        foreach my $virtdev (sort keys %$virtdev_hash) {
-            my $d = $virtdev_hash->{$virtdev};
-            next if $d->{is_cloudinit}; # no need to restore cloudinit
+        if (!$options->{live}) {
+            foreach my $virtdev (sort keys %$virtdev_hash) {
+                my $d = $virtdev_hash->{$virtdev};
+                next if $d->{is_cloudinit}; # no need to restore cloudinit
 
-            my $volid = $d->{volid};
+                my $volid = $d->{volid};
 
-            my $path = PVE::Storage::path($storecfg, $volid);
+                my $path = PVE::Storage::path($storecfg, $volid);
 
-            # This is the ONLY user of the PBS_ env vars set on top of this function!
-            my $pbs_restore_cmd = [
-                '/usr/bin/pbs-restore',
-                '--repository', $repo,
-                $pbs_backup_name,
-                "$d->{devname}.img.fidx",
-                $path,
-                '--verbose',
-            ];
+                my $pbs_restore_cmd = [
+                    '/usr/bin/pbs-restore',
+                    '--repository', $repo,
+                    $pbs_backup_name,
+                    "$d->{devname}.img.fidx",
+                    $path,
+                    '--verbose',
+                ];
 
-            push @$pbs_restore_cmd, '--format', $d->{format} if $d->{format};
-            push @$pbs_restore_cmd, '--keyfile', $keyfile if -e $keyfile;
+                push @$pbs_restore_cmd, '--format', $d->{format} if $d->{format};
+                push @$pbs_restore_cmd, '--keyfile', $keyfile if -e $keyfile;
 
-            if (PVE::Storage::volume_has_feature($storecfg, 'sparseinit', $volid)) {
-                push @$pbs_restore_cmd, '--skip-zero';
+                if (PVE::Storage::volume_has_feature($storecfg, 'sparseinit', $volid)) {
+                    push @$pbs_restore_cmd, '--skip-zero';
+                }
+
+                my $dbg_cmdstring = PVE::Tools::cmd2string($pbs_restore_cmd);
+                print "restore proxmox backup image: $dbg_cmdstring\n";
+                run_command($pbs_restore_cmd);
             }
-
-            my $dbg_cmdstring = PVE::Tools::cmd2string($pbs_restore_cmd);
-            print "restore proxmox backup image: $dbg_cmdstring\n";
-            run_command($pbs_restore_cmd);
         }
 
         $fh->seek(0, 0) || die "seek failed - $!\n";
@@ -6269,7 +6270,9 @@ sub restore_proxmox_backup_archive {
     };
     my $err = $@;
 
-    $restore_deactivate_volumes->($storecfg, $devinfo);
+    if ($err || !$options->{live}) {
+        $restore_deactivate_volumes->($storecfg, $devinfo);
+    }
 
     rmtree $tmpdir;
 
@@ -6286,6 +6289,103 @@ sub restore_proxmox_backup_archive {
 
     eval { rescan($vmid, 1); };
     warn $@ if $@;
+
+    PVE::AccessControl::add_vm_to_pool($vmid, $options->{pool}) if $options->{pool};
+
+    if ($options->{live}) {
+        eval {
+            # enable interrupts
+            local $SIG{INT} =
+                local $SIG{TERM} =
+                local $SIG{QUIT} =
+                local $SIG{HUP} =
+                local $SIG{PIPE} = sub { die "interrupted by signal\n"; };
+
+            my $conf = PVE::QemuConfig->load_config($vmid);
+            die "cannot do live-restore for template\n"
+                if PVE::QemuConfig->is_template($conf);
+
+            pbs_live_restore($vmid, $conf, $storecfg, $devinfo, $repo, $keyfile, $pbs_backup_name);
+        };
+
+        $err = $@;
+        if ($err) {
+            warn "Destroying live-restore VM, all temporary data will be lost!\n";
+            $restore_deactivate_volumes->($storecfg, $devinfo);
+            $restore_destroy_volumes->($storecfg, $devinfo);
+            unlink $conffile;
+            die $err;
+        }
+    }
+}
+
+sub pbs_live_restore {
+    my ($vmid, $conf, $storecfg, $restored_disks, $repo, $keyfile, $snap) = @_;
+
+    print "Starting VM for live-restore\n";
+
+    my $pbs_backing = {};
+    foreach my $ds (keys %$restored_disks) {
+        $ds =~ m/^drive-(.*)$/;
+        $pbs_backing->{$1} = {
+            repository => $repo,
+            snapshot => $snap,
+            archive => "$ds.img.fidx",
+        };
+        $pbs_backing->{$1}->{keyfile} = $keyfile if -e $keyfile;
+    }
+
+    eval {
+        # make sure HA doesn't interrupt our restore by stopping the VM
+        if (PVE::HA::Config::vm_is_ha_managed($vmid)) {
+            my $cmd = ['ha-manager', 'set', "vm:$vmid", '--state', 'started'];
+            PVE::Tools::run_command($cmd);
+        }
+
+        # start VM with backing chain pointing to PBS backup, environment vars
+        # for PBS driver in QEMU (PBS_PASSWORD and PBS_FINGERPRINT) are already
+        # set by our caller
+        PVE::QemuServer::vm_start_nolock(
+            $storecfg,
+            $vmid,
+            $conf,
+            {
+                paused => 1,
+                'pbs-backing' => $pbs_backing,
+            },
+            {},
+        );
+
+        # begin streaming, i.e. data copy from PBS to target disk for every vol,
+        # this will effectively collapse the backing image chain consisting of
+        # [target <- alloc-track -> PBS snapshot] to just [target] (alloc-track
+        # removes itself once all backing images vanish with 'auto-remove=on')
+        my $jobs = {};
+        foreach my $ds (keys %$restored_disks) {
+            my $job_id = "restore-$ds";
+            mon_cmd($vmid, 'block-stream',
+                'job-id' => $job_id,
+                device => "$ds",
+            );
+            $jobs->{$job_id} = {};
+        }
+
+        mon_cmd($vmid, 'cont');
+        qemu_drive_mirror_monitor($vmid, undef, $jobs, 'auto', 0, 'stream');
+
+        # all jobs finished, remove blockdevs now to disconnect from PBS
+        foreach my $ds (keys %$restored_disks) {
+            mon_cmd($vmid, 'blockdev-del', 'node-name' => "$ds-pbs");
+        }
+    };
+
+    my $err = $@;
+
+    if ($err) {
+        warn "An error occurred during live-restore: $err\n";
+        _do_vm_stop($storecfg, $vmid, 1, 1, 10, 0, 1);
+        die "live-restore failed\n";
+    }
 }
 
 sub restore_vma_archive {
@@ -6498,6 +6598,8 @@ sub restore_vma_archive {
 
     eval { rescan($vmid, 1); };
     warn $@ if $@;
+
+    PVE::AccessControl::add_vm_to_pool($vmid, $opts->{pool}) if $opts->{pool};
 }
 
 sub restore_tar_archive {
-- 
2.20.1
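
For readers who want to trace the order of monitor commands that pbs_live_restore issues, here is a minimal standalone sketch. It is not part of the patch: fake_mon_cmd and sketch_live_restore are hypothetical stand-ins that only record commands, and the VMID and drive names are made-up sample values. It assumes the VM has already been started paused with the PBS backing chain, and it leaves out the wait for the stream jobs (qemu_drive_mirror_monitor in the patch).

```perl
use strict;
use warnings;

# Hypothetical stand-in for mon_cmd(): records each command instead of
# sending it to a running QEMU instance.
my @sent;
sub fake_mon_cmd {
    my ($vmid, $cmd, %args) = @_;
    push @sent, { vmid => $vmid, cmd => $cmd, args => {%args} };
    return;
}

# Follows the ordering used in the patch: one 'block-stream' job per
# restored drive, then 'cont' to resume the paused guest, then one
# 'blockdev-del' per drive once streaming is done.
sub sketch_live_restore {
    my ($vmid, $restored_disks) = @_;

    my $jobs = {};
    for my $ds (sort keys %$restored_disks) {
        my $job_id = "restore-$ds";
        fake_mon_cmd($vmid, 'block-stream', 'job-id' => $job_id, device => $ds);
        $jobs->{$job_id} = {};
    }

    fake_mon_cmd($vmid, 'cont');

    # The patch blocks here until all stream jobs have completed.

    for my $ds (sort keys %$restored_disks) {
        fake_mon_cmd($vmid, 'blockdev-del', 'node-name' => "$ds-pbs");
    }

    return $jobs;
}

# Sample values only; 100 and the drive names are assumptions.
my $jobs = sketch_live_restore(100, { 'drive-scsi0' => {}, 'drive-scsi1' => {} });
print join(' ', map { $_->{cmd} } @sent), "\n";
```

The point of the ordering is that the guest only resumes after every stream job has been created, and the "$ds-pbs" nodes are dropped only after streaming has finished, so the guest never reads from a node that has already been removed.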