Message-ID: <3f5de144-086e-570f-da61-e05dd6d2e365@proxmox.com>
Date: Thu, 3 Feb 2022 10:31:19 +0100
To: pve-devel@lists.proxmox.com, Fabian Grünbichler
References: <20220127140155.66141-1-f.ebner@proxmox.com>
 <20220127140155.66141-3-f.ebner@proxmox.com>
 <1643631853.hgjpywv6g4.astroid@nora.none>
From: Fabian Ebner
In-Reply-To: <1643631853.hgjpywv6g4.astroid@nora.none>
Subject: Re: [pve-devel] [PATCH qemu-server 2/4] api: clone: fork before locking
List-Id: Proxmox VE development discussion

On 31.01.22 at 13:34, Fabian Grünbichler wrote:
> On January 27, 2022 3:01 pm, Fabian Ebner wrote:
>> using the familiar early+repeated checks pattern from other API calls.
>> Only intended functional changes are with regard to locking/forking.
> 
> two questions:
> - the FW config cloning happens inside the worker now, while it was
> previously before forking the worker (LGTM, but might be called out
> explicitly if intentional ;))

Honestly, I didn't think too much about it, so thanks for pointing that
out! But thinking about it now, I don't see an obvious issue with it
either, and IMHO it feels more natural for the cloning to be part of the
worker, since the worker takes the firewall config lock and the cleanup
also happens there.
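For reference, the "early + repeated checks" pattern the commit message
refers to boils down to the following condensed, self-contained sketch.
The names here are made up purely for illustration and there are no PVE
modules involved; in the actual patch the check closure is
$load_and_check and the worker body is $clonefn:

#!/usr/bin/perl
use strict;
use warnings;

# Illustration only - hypothetical names, no PVE modules involved.
sub clone_endpoint {
    my ($param) = @_;

    # All checks that may fail go into one closure, so they can run twice.
    my $load_and_check = sub {
        die "no vmid given\n" if !defined($param->{vmid});
        die "no target storage given\n" if !defined($param->{storage});
        return { vmid => $param->{vmid}, storage => $param->{storage} };
    };

    # 1) early call: fail fast, before forking a worker or taking locks
    $load_and_check->();

    my $worker = sub {
        # 2) repeated call: after fork + lock, the cluster state may have
        # changed, so the same checks run again and the worker only acts
        # on the freshly loaded state
        my $state = $load_and_check->();
        print "cloning VM $state->{vmid} to storage $state->{storage}\n";
    };

    # in the real code, $worker is wrapped in fork_worker() + lock_config*()
    return $worker->();
}

clone_endpoint({ vmid => 104, storage => 'local-lvm' });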
> - there are some checks at the start of the endpoint (checking
> storage/target), which are not repeated after the fork+lock - while
> unlikely, our view of storage.cfg could change in-between (lock guest
> config -> cfs_update). should those be moved in the check sub (or into
> the check_storage_access_clone helper)?
> 

Yes, for better consistency that should be done. Either way is fine with
me. Should I send a v2, or are you going to do a follow-up? (There is a
rough sketch of what I mean at the very end of this mail.)

> rest of the series LGTM
> 
>> 
>> For a full clone of a running VM without guest agent, this also fixes
>> issuing vm_{resume,suspend} calls for drive mirror completion.
>> Previously, those just timed out, because of not getting the lock:
>> 
>>> create full clone of drive scsi0 (rbdkvm:vm-104-disk-0)
>>> Formatting '/var/lib/vz/images/105/vm-105-disk-0.raw', fmt=raw
>>> size=4294967296 preallocation=off
>>> drive mirror is starting for drive-scsi0
>>> drive-scsi0: transferred 2.0 MiB of 4.0 GiB (0.05%) in 0s
>>> drive-scsi0: transferred 635.0 MiB of 4.0 GiB (15.50%) in 1s
>>> drive-scsi0: transferred 1.6 GiB of 4.0 GiB (40.50%) in 2s
>>> drive-scsi0: transferred 3.6 GiB of 4.0 GiB (90.23%) in 3s
>>> drive-scsi0: transferred 4.0 GiB of 4.0 GiB (100.00%) in 4s, ready
>>> all 'mirror' jobs are ready
>>> suspend vm
>>> trying to acquire lock...
>>> can't lock file '/var/lock/qemu-server/lock-104.conf' - got timeout
>>> drive-scsi0: Cancelling block job
>>> drive-scsi0: Done.
>>> resume vm
>>> trying to acquire lock...
>>> can't lock file '/var/lock/qemu-server/lock-104.conf' - got timeout
>>
>> Signed-off-by: Fabian Ebner
>> ---
>>
>> Best viewed with -w.
>>
>>  PVE/API2/Qemu.pm | 220 ++++++++++++++++++++++++-----------------------
>>  1 file changed, 112 insertions(+), 108 deletions(-)
>>
>> diff --git a/PVE/API2/Qemu.pm b/PVE/API2/Qemu.pm
>> index 6992f6f..38e08c8 100644
>> --- a/PVE/API2/Qemu.pm
>> +++ b/PVE/API2/Qemu.pm
>> @@ -3079,9 +3079,7 @@ __PACKAGE__->register_method({
>> 
>>      my $running = PVE::QemuServer::check_running($vmid) || 0;
>> 
>> -    my $clonefn = sub {
>> -        # do all tests after lock but before forking worker - if possible
>> -
>> +    my $load_and_check = sub {
>>          my $conf = PVE::QemuConfig->load_config($vmid);
>>          PVE::QemuConfig->check_lock($conf);
>> 
>> @@ -3091,7 +3089,7 @@ __PACKAGE__->register_method({
>>          die "snapshot '$snapname' does not exist\n"
>>              if $snapname && !defined( $conf->{snapshots}->{$snapname});
>> 
>> -        my $full = extract_param($param, 'full') // !PVE::QemuConfig->is_template($conf);
>> +        my $full = $param->{full} // !PVE::QemuConfig->is_template($conf);
>> 
>>          die "parameter 'storage' not allowed for linked clones\n"
>>              if defined($storage) && !$full;
>> @@ -3156,7 +3154,13 @@ __PACKAGE__->register_method({
>>              }
>>          }
>> 
>> -        # auto generate a new uuid
>> +        return ($conffile, $newconf, $oldconf, $vollist, $drives, $fullclone);
>> +    };
>> +
>> +    my $clonefn = sub {
>> +        my ($conffile, $newconf, $oldconf, $vollist, $drives, $fullclone) = $load_and_check->();
>> +
>> +        # auto generate a new uuid
>>          my $smbios1 = PVE::QemuServer::parse_smbios1($newconf->{smbios1} || '');
>>          $smbios1->{uuid} = PVE::QemuServer::generate_uuid();
>>          $newconf->{smbios1} = PVE::QemuServer::print_smbios1($smbios1);
>> @@ -3181,105 +3185,99 @@ __PACKAGE__->register_method({
>>          # FIXME use PVE::QemuConfig->create_and_lock_config and adapt code
>>          PVE::Tools::file_set_contents($conffile, "# qmclone temporary file\nlock: clone\n");
>> 
>> -        my $realcmd = sub {
>> -            my $upid = shift;
>> -
>> -            my $newvollist = [];
>> -            my $jobs = {};
>> -
>> -            eval {
>> -                local $SIG{INT} =
>> -                local $SIG{TERM} =
>> -                local $SIG{QUIT} =
>> -                local $SIG{HUP} = sub { die "interrupted by signal\n"; };
>> -
>> -                PVE::Storage::activate_volumes($storecfg, $vollist, $snapname);
>> -
>> -                my $bwlimit = extract_param($param, 'bwlimit');
>> -
>> -                my $total_jobs = scalar(keys %{$drives});
>> -                my $i = 1;
>> -
>> -                foreach my $opt (sort keys %$drives) {
>> -                    my $drive = $drives->{$opt};
>> -                    my $skipcomplete = ($total_jobs != $i); # finish after last drive
>> -                    my $completion = $skipcomplete ? 'skip' : 'complete';
>> -
>> -                    my $src_sid = PVE::Storage::parse_volume_id($drive->{file});
>> -                    my $storage_list = [ $src_sid ];
>> -                    push @$storage_list, $storage if defined($storage);
>> -                    my $clonelimit = PVE::Storage::get_bandwidth_limit('clone', $storage_list, $bwlimit);
>> -
>> -                    my $newdrive = PVE::QemuServer::clone_disk(
>> -                        $storecfg,
>> -                        $vmid,
>> -                        $running,
>> -                        $opt,
>> -                        $drive,
>> -                        $snapname,
>> -                        $newid,
>> -                        $storage,
>> -                        $format,
>> -                        $fullclone->{$opt},
>> -                        $newvollist,
>> -                        $jobs,
>> -                        $completion,
>> -                        $oldconf->{agent},
>> -                        $clonelimit,
>> -                        $oldconf
>> -                    );
>> -
>> -                    $newconf->{$opt} = PVE::QemuServer::print_drive($newdrive);
>> -
>> -                    PVE::QemuConfig->write_config($newid, $newconf);
>> -                    $i++;
>> -                }
>> -
>> -                delete $newconf->{lock};
>> -
>> -                # do not write pending changes
>> -                if (my @changes = keys %{$newconf->{pending}}) {
>> -                    my $pending = join(',', @changes);
>> -                    warn "found pending changes for '$pending', discarding for clone\n";
>> -                    delete $newconf->{pending};
>> -                }
>> -
>> -                PVE::QemuConfig->write_config($newid, $newconf);
>> -
>> -                if ($target) {
>> -                    # always deactivate volumes - avoid lvm LVs to be active on several nodes
>> -                    PVE::Storage::deactivate_volumes($storecfg, $vollist, $snapname) if !$running;
>> -                    PVE::Storage::deactivate_volumes($storecfg, $newvollist);
>> -
>> -                    my $newconffile = PVE::QemuConfig->config_file($newid, $target);
>> -                    die "Failed to move config to node '$target' - rename failed: $!\n"
>> -                        if !rename($conffile, $newconffile);
>> -                }
>> -
>> -                PVE::AccessControl::add_vm_to_pool($newid, $pool) if $pool;
>> -            };
>> -            if (my $err = $@) {
>> -                eval { PVE::QemuServer::qemu_blockjobs_cancel($vmid, $jobs) };
>> -                sleep 1; # some storage like rbd need to wait before release volume - really?
>> -
>> -                foreach my $volid (@$newvollist) {
>> -                    eval { PVE::Storage::vdisk_free($storecfg, $volid); };
>> -                    warn $@ if $@;
>> -                }
>> -
>> -                PVE::Firewall::remove_vmfw_conf($newid);
>> -
>> -                unlink $conffile; # avoid races -> last thing before die
>> -
>> -                die "clone failed: $err";
>> -            }
>> -
>> -            return;
>> -        };
>> -
>>          PVE::Firewall::clone_vmfw_conf($vmid, $newid);
>> 
>> -        return $rpcenv->fork_worker('qmclone', $vmid, $authuser, $realcmd);
>> +        my $newvollist = [];
>> +        my $jobs = {};
>> +
>> +        eval {
>> +            local $SIG{INT} =
>> +            local $SIG{TERM} =
>> +            local $SIG{QUIT} =
>> +            local $SIG{HUP} = sub { die "interrupted by signal\n"; };
>> +
>> +            PVE::Storage::activate_volumes($storecfg, $vollist, $snapname);
>> +
>> +            my $bwlimit = extract_param($param, 'bwlimit');
>> +
>> +            my $total_jobs = scalar(keys %{$drives});
>> +            my $i = 1;
>> +
>> +            foreach my $opt (sort keys %$drives) {
>> +                my $drive = $drives->{$opt};
>> +                my $skipcomplete = ($total_jobs != $i); # finish after last drive
>> +                my $completion = $skipcomplete ? 'skip' : 'complete';
>> +
>> +                my $src_sid = PVE::Storage::parse_volume_id($drive->{file});
>> +                my $storage_list = [ $src_sid ];
>> +                push @$storage_list, $storage if defined($storage);
>> +                my $clonelimit = PVE::Storage::get_bandwidth_limit('clone', $storage_list, $bwlimit);
>> +
>> +                my $newdrive = PVE::QemuServer::clone_disk(
>> +                    $storecfg,
>> +                    $vmid,
>> +                    $running,
>> +                    $opt,
>> +                    $drive,
>> +                    $snapname,
>> +                    $newid,
>> +                    $storage,
>> +                    $format,
>> +                    $fullclone->{$opt},
>> +                    $newvollist,
>> +                    $jobs,
>> +                    $completion,
>> +                    $oldconf->{agent},
>> +                    $clonelimit,
>> +                    $oldconf
>> +                );
>> +
>> +                $newconf->{$opt} = PVE::QemuServer::print_drive($newdrive);
>> +
>> +                PVE::QemuConfig->write_config($newid, $newconf);
>> +                $i++;
>> +            }
>> +
>> +            delete $newconf->{lock};
>> +
>> +            # do not write pending changes
>> +            if (my @changes = keys %{$newconf->{pending}}) {
>> +                my $pending = join(',', @changes);
>> +                warn "found pending changes for '$pending', discarding for clone\n";
>> +                delete $newconf->{pending};
>> +            }
>> +
>> +            PVE::QemuConfig->write_config($newid, $newconf);
>> +
>> +            if ($target) {
>> +                # always deactivate volumes - avoid lvm LVs to be active on several nodes
>> +                PVE::Storage::deactivate_volumes($storecfg, $vollist, $snapname) if !$running;
>> +                PVE::Storage::deactivate_volumes($storecfg, $newvollist);
>> +
>> +                my $newconffile = PVE::QemuConfig->config_file($newid, $target);
>> +                die "Failed to move config to node '$target' - rename failed: $!\n"
>> +                    if !rename($conffile, $newconffile);
>> +            }
>> +
>> +            PVE::AccessControl::add_vm_to_pool($newid, $pool) if $pool;
>> +        };
>> +        if (my $err = $@) {
>> +            eval { PVE::QemuServer::qemu_blockjobs_cancel($vmid, $jobs) };
>> +            sleep 1; # some storage like rbd need to wait before release volume - really?
>> +
>> +            foreach my $volid (@$newvollist) {
>> +                eval { PVE::Storage::vdisk_free($storecfg, $volid); };
>> +                warn $@ if $@;
>> +            }
>> +
>> +            PVE::Firewall::remove_vmfw_conf($newid);
>> +
>> +            unlink $conffile; # avoid races -> last thing before die
>> +
>> +            die "clone failed: $err";
>> +        }
>> +
>> +        return;
>>      };
>> 
>>      # Aquire exclusive lock lock for $newid
>> @@ -3287,12 +3285,18 @@ __PACKAGE__->register_method({
>>          return PVE::QemuConfig->lock_config_full($newid, 1, $clonefn);
>>      };
>> 
>> -    # exclusive lock if VM is running - else shared lock is enough;
>> -    if ($running) {
>> -        return PVE::QemuConfig->lock_config_full($vmid, 1, $lock_target_vm);
>> -    } else {
>> -        return PVE::QemuConfig->lock_config_shared($vmid, 1, $lock_target_vm);
>> -    }
>> +    my $lock_source_vm = sub {
>> +        # exclusive lock if VM is running - else shared lock is enough;
>> +        if ($running) {
>> +            return PVE::QemuConfig->lock_config_full($vmid, 1, $lock_target_vm);
>> +        } else {
>> +            return PVE::QemuConfig->lock_config_shared($vmid, 1, $lock_target_vm);
>> +        }
>> +    };
>> +
>> +    $load_and_check->(); # early checks before forking/locking
>> +
>> +    return $rpcenv->fork_worker('qmclone', $vmid, $authuser, $lock_source_vm);
>>  }});
>> 
>>  __PACKAGE__->register_method({
>> --
>> 2.30.2
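Regarding the storage/target checks from the second point above, here is
a rough sketch of the placement I have in mind. This is illustrative
only and not an actual diff - it just marks where the re-check would go,
reusing the names from the patch:

my $load_and_check = sub {
    my $conf = PVE::QemuConfig->load_config($vmid);
    PVE::QemuConfig->check_lock($conf);

    # ... existing permission/snapshot/disk checks ...

    # Additionally re-do the target node / storage checks that currently
    # only run at the start of the endpoint - e.g. by moving them into
    # check_storage_access_clone() or a small helper called from both
    # places - so they are based on the storage.cfg view seen after
    # fork + lock + cfs_update.

    return ($conffile, $newconf, $oldconf, $vollist, $drives, $fullclone);
};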