public inbox for pve-devel@lists.proxmox.com
 help / color / mirror / Atom feed
* [pve-devel] [PATCH qemu-server 1/1] fix 1734: clone VM: if deactivation fails demote error to warning
@ 2024-03-06 10:47 Hannes Duerr
  2024-03-06 11:37 ` Friedrich Weber
  2024-03-06 12:40 ` Fiona Ebner
  0 siblings, 2 replies; 7+ messages in thread
From: Hannes Duerr @ 2024-03-06 10:47 UTC (permalink / raw)
  To: pve-devel

When a template with disks on LVM is cloned to another node, the storage
is first activated, then cloned and deactivated again after cloning.

However, if clones of this template are now created in parellel to other
nodes, it can happen that one of the tasks can no longer deactivate the
logical volume because it is still in use.  The reason for this is that
we use a shared lock.
Since the failed deactivation does not necessarily have consequences, we
downgrade the error to a warning, which means that the clone tasks will
continue to be completed successfully.

Signed-off-by: Hannes Duerr <h.duerr@proxmox.com>
---
 PVE/API2/Qemu.pm | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/PVE/API2/Qemu.pm b/PVE/API2/Qemu.pm
index 69c5896..f1e88b8 100644
--- a/PVE/API2/Qemu.pm
+++ b/PVE/API2/Qemu.pm
@@ -48,6 +48,7 @@ use PVE::DataCenterConfig;
 use PVE::SSHInfo;
 use PVE::Replication;
 use PVE::StorageTunnel;
+use PVE::RESTEnvironment qw(log_warn);
 
 BEGIN {
     if (!$ENV{PVE_GENERATING_DOCS}) {
@@ -3820,7 +3821,13 @@ __PACKAGE__->register_method({
 
 		if ($target) {
 		    # always deactivate volumes - avoid lvm LVs to be active on several nodes
-		    PVE::Storage::deactivate_volumes($storecfg, $vollist, $snapname) if !$running;
+		    eval {
+			PVE::Storage::deactivate_volumes($storecfg, $vollist, $snapname) if !$running;
+		    };
+		    my $err = $@;
+		    if ($err) {
+			log_warn("$err\n");
+		    }
 		    PVE::Storage::deactivate_volumes($storecfg, $newvollist);
 
 		    my $newconffile = PVE::QemuConfig->config_file($newid, $target);
-- 
2.39.2





^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [pve-devel] [PATCH qemu-server 1/1] fix 1734: clone VM: if deactivation fails demote error to warning
  2024-03-06 10:47 [pve-devel] [PATCH qemu-server 1/1] fix 1734: clone VM: if deactivation fails demote error to warning Hannes Duerr
@ 2024-03-06 11:37 ` Friedrich Weber
  2024-03-06 12:31   ` Fiona Ebner
  2024-03-06 12:40 ` Fiona Ebner
  1 sibling, 1 reply; 7+ messages in thread
From: Friedrich Weber @ 2024-03-06 11:37 UTC (permalink / raw)
  To: Proxmox VE development discussion, Hannes Duerr

Thanks for tackling this! Can confirm this patch demotes the error to a
warning and lets the qmclone task succeed (with a warning). GUI shows
"Warnings: 1" and task log contains:

can't deactivate LV '/dev/foobar/vm-100-disk-0':   Logical volume
foobar/vm-100-disk-0 in use.
WARN: volume deactivation failed: foobar:vm-100-disk-0 at
/usr/share/perl5/PVE/Storage.pm line 1246.

Small nits in-line:

On 06/03/2024 11:47, Hannes Duerr wrote:
> When a template with disks on LVM is cloned to another node, the storage
> is first activated, then cloned and deactivated again after cloning.

s/storage is/volumes are/g

> 
> However, if clones of this template are now created in parellel to other

typo: parallel

> nodes, it can happen that one of the tasks can no longer deactivate the
> logical volume because it is still in use.  The reason for this is that
> we use a shared lock.
> Since the failed deactivation does not necessarily have consequences, we
> downgrade the error to a warning, which means that the clone tasks will
> continue to be completed successfully.
> 
> Signed-off-by: Hannes Duerr <h.duerr@proxmox.com>
> ---
>  PVE/API2/Qemu.pm | 9 ++++++++-
>  1 file changed, 8 insertions(+), 1 deletion(-)
> 
> diff --git a/PVE/API2/Qemu.pm b/PVE/API2/Qemu.pm
> index 69c5896..f1e88b8 100644
> --- a/PVE/API2/Qemu.pm
> +++ b/PVE/API2/Qemu.pm
> @@ -48,6 +48,7 @@ use PVE::DataCenterConfig;
>  use PVE::SSHInfo;
>  use PVE::Replication;
>  use PVE::StorageTunnel;
> +use PVE::RESTEnvironment qw(log_warn);
>  
>  BEGIN {
>      if (!$ENV{PVE_GENERATING_DOCS}) {
> @@ -3820,7 +3821,13 @@ __PACKAGE__->register_method({
>  
>  		if ($target) {
>  		    # always deactivate volumes - avoid lvm LVs to be active on several nodes
> -		    PVE::Storage::deactivate_volumes($storecfg, $vollist, $snapname) if !$running;
> +		    eval {
> +			PVE::Storage::deactivate_volumes($storecfg, $vollist, $snapname) if !$running;
> +		    };
> +		    my $err = $@;
> +		    if ($err) {
> +			log_warn("$err\n");
> +		    }

I think the extra \n adds an unnecessary newline here, which looks a bit
weird in the task log (though I'm not sure why the `chomp` in `log_warn`
doesn't remove the newline).

While at it, I think the four lines can be shortened to

> log_warn($@) if $@;

Though that might be too terse -- someone with more Perl experience than
me should judge that :)

>  		    PVE::Storage::deactivate_volumes($storecfg, $newvollist);
>  
>  		    my $newconffile = PVE::QemuConfig->config_file($newid, $target);




^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [pve-devel] [PATCH qemu-server 1/1] fix 1734: clone VM: if deactivation fails demote error to warning
  2024-03-06 11:37 ` Friedrich Weber
@ 2024-03-06 12:31   ` Fiona Ebner
  0 siblings, 0 replies; 7+ messages in thread
From: Fiona Ebner @ 2024-03-06 12:31 UTC (permalink / raw)
  To: Proxmox VE development discussion, Friedrich Weber, Hannes Duerr

Am 06.03.24 um 12:37 schrieb Friedrich Weber:
> Thanks for tackling this! Can confirm this patch demotes the error to a
> warning and lets the qmclone task succeed (with a warning). GUI shows
> "Warnings: 1" and task log contains:
> 
> can't deactivate LV '/dev/foobar/vm-100-disk-0':   Logical volume
> foobar/vm-100-disk-0 in use.
> WARN: volume deactivation failed: foobar:vm-100-disk-0 at
> /usr/share/perl5/PVE/Storage.pm line 1246.
> 

Sounds like there is a missing newline after the error message in
PVE/Storage.pm. That's why Perl prints the file/line info.


>> @@ -3820,7 +3821,13 @@ __PACKAGE__->register_method({
>>  
>>  		if ($target) {
>>  		    # always deactivate volumes - avoid lvm LVs to be active on several nodes
>> -		    PVE::Storage::deactivate_volumes($storecfg, $vollist, $snapname) if !$running;
>> +		    eval {
>> +			PVE::Storage::deactivate_volumes($storecfg, $vollist, $snapname) if !$running;
>> +		    };
>> +		    my $err = $@;
>> +		    if ($err) {
>> +			log_warn("$err\n");
>> +		    }
> 
> I think the extra \n adds an unnecessary newline here, which looks a bit
> weird in the task log (though I'm not sure why the `chomp` in `log_warn`
> doesn't remove the newline).
> 
> While at it, I think the four lines can be shortened to
> 
>> log_warn($@) if $@;
> 
> Though that might be too terse -- someone with more Perl experience than
> me should judge that :)
> 

It's fine if the error is only used for printing and this comes
immediately after the eval.

In cases, you do something else with the error it can still be
if (my $err = $@) {
}




^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [pve-devel] [PATCH qemu-server 1/1] fix 1734: clone VM: if deactivation fails demote error to warning
  2024-03-06 10:47 [pve-devel] [PATCH qemu-server 1/1] fix 1734: clone VM: if deactivation fails demote error to warning Hannes Duerr
  2024-03-06 11:37 ` Friedrich Weber
@ 2024-03-06 12:40 ` Fiona Ebner
  2024-03-06 13:14   ` Friedrich Weber
  1 sibling, 1 reply; 7+ messages in thread
From: Fiona Ebner @ 2024-03-06 12:40 UTC (permalink / raw)
  To: Proxmox VE development discussion, Hannes Duerr

Am 06.03.24 um 11:47 schrieb Hannes Duerr:
> @@ -3820,7 +3821,13 @@ __PACKAGE__->register_method({
>  
>  		if ($target) {
>  		    # always deactivate volumes - avoid lvm LVs to be active on several nodes
> -		    PVE::Storage::deactivate_volumes($storecfg, $vollist, $snapname) if !$running;
> +		    eval {
> +			PVE::Storage::deactivate_volumes($storecfg, $vollist, $snapname) if !$running;
> +		    };
> +		    my $err = $@;
> +		    if ($err) {
> +			log_warn("$err\n");
> +		    }
>  		    PVE::Storage::deactivate_volumes($storecfg, $newvollist);

We might also want to catch errors here. Otherwise, the whole clone
operation (which might've taken hours) can still fail just because of a
deactivation error. But when failing here, we shouldn't move the config
file (or the LV can get active on multiple nodes more easily). Can be a
separate patch or the same (since it's still about demoting deactivation
failure, would still fit logically).

>  
>  		    my $newconffile = PVE::QemuConfig->config_file($newid, $target);




^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [pve-devel] [PATCH qemu-server 1/1] fix 1734: clone VM: if deactivation fails demote error to warning
  2024-03-06 12:40 ` Fiona Ebner
@ 2024-03-06 13:14   ` Friedrich Weber
  2024-03-06 14:04     ` Fiona Ebner
  0 siblings, 1 reply; 7+ messages in thread
From: Friedrich Weber @ 2024-03-06 13:14 UTC (permalink / raw)
  To: Proxmox VE development discussion, Fiona Ebner, Hannes Duerr

On 06/03/2024 13:40, Fiona Ebner wrote:
> Am 06.03.24 um 11:47 schrieb Hannes Duerr:
>> @@ -3820,7 +3821,13 @@ __PACKAGE__->register_method({
>>  
>>  		if ($target) {
>>  		    # always deactivate volumes - avoid lvm LVs to be active on several nodes
>> -		    PVE::Storage::deactivate_volumes($storecfg, $vollist, $snapname) if !$running;
>> +		    eval {
>> +			PVE::Storage::deactivate_volumes($storecfg, $vollist, $snapname) if !$running;
>> +		    };
>> +		    my $err = $@;
>> +		    if ($err) {
>> +			log_warn("$err\n");
>> +		    }
>>  		    PVE::Storage::deactivate_volumes($storecfg, $newvollist);
> 
> We might also want to catch errors here. Otherwise, the whole clone
> operation (which might've taken hours) can still fail just because of a
> deactivation error. But when failing here, we shouldn't move the config
> file (or the LV can get active on multiple nodes more easily).

I think succeeding but not moving the config file when deactivating
$newvollist fails sounds like it could lead to unexpected behavior.
Right now, when running `qm clone 101 [...] --target node2` on node1
succeeds, one can be sure there will be an VM 101 on node2. But if we
cannot deactivate $newvollist and thus don't move the config file, the
command succeeds but VM 101 instead exists on node1 (correct me if I'm
wrong), which may be confusing e.g. if the clone is automated.

To avoid that, I'd lean towards keeping the behavior of failing the task
if deactivating $newvollist fails. After all, at least in case of LVM
not being able to deactivate because the device is in use, we just
created $newvollist so hopefully nobody else should be accessing it.




^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [pve-devel] [PATCH qemu-server 1/1] fix 1734: clone VM: if deactivation fails demote error to warning
  2024-03-06 13:14   ` Friedrich Weber
@ 2024-03-06 14:04     ` Fiona Ebner
  2024-03-06 14:19       ` Hannes Dürr
  0 siblings, 1 reply; 7+ messages in thread
From: Fiona Ebner @ 2024-03-06 14:04 UTC (permalink / raw)
  To: Friedrich Weber, Proxmox VE development discussion, Hannes Duerr

Am 06.03.24 um 14:14 schrieb Friedrich Weber:
> On 06/03/2024 13:40, Fiona Ebner wrote:
>> Am 06.03.24 um 11:47 schrieb Hannes Duerr:
>>> @@ -3820,7 +3821,13 @@ __PACKAGE__->register_method({
>>>  
>>>  		if ($target) {
>>>  		    # always deactivate volumes - avoid lvm LVs to be active on several nodes
>>> -		    PVE::Storage::deactivate_volumes($storecfg, $vollist, $snapname) if !$running;
>>> +		    eval {
>>> +			PVE::Storage::deactivate_volumes($storecfg, $vollist, $snapname) if !$running;
>>> +		    };
>>> +		    my $err = $@;
>>> +		    if ($err) {
>>> +			log_warn("$err\n");
>>> +		    }
>>>  		    PVE::Storage::deactivate_volumes($storecfg, $newvollist);
>>
>> We might also want to catch errors here. Otherwise, the whole clone
>> operation (which might've taken hours) can still fail just because of a
>> deactivation error. But when failing here, we shouldn't move the config
>> file (or the LV can get active on multiple nodes more easily).
> 
> I think succeeding but not moving the config file when deactivating
> $newvollist fails sounds like it could lead to unexpected behavior.
> Right now, when running `qm clone 101 [...] --target node2` on node1
> succeeds, one can be sure there will be an VM 101 on node2. But if we
> cannot deactivate $newvollist and thus don't move the config file, the
> command succeeds but VM 101 instead exists on node1 (correct me if I'm
> wrong), which may be confusing e.g. if the clone is automated.
> 

Yes, but the question is what is worse: Needing to re-do the clone or
having the VM config on the wrong node?

> To avoid that, I'd lean towards keeping the behavior of failing the task
> if deactivating $newvollist fails. After all, at least in case of LVM
> not being able to deactivate because the device is in use, we just
> created $newvollist so hopefully nobody else should be accessing it.

Fine by me. Yes, it's unlikely to fail. And we can still adapt later if
users ever complain about it.




^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [pve-devel] [PATCH qemu-server 1/1] fix 1734: clone VM: if deactivation fails demote error to warning
  2024-03-06 14:04     ` Fiona Ebner
@ 2024-03-06 14:19       ` Hannes Dürr
  0 siblings, 0 replies; 7+ messages in thread
From: Hannes Dürr @ 2024-03-06 14:19 UTC (permalink / raw)
  To: Fiona Ebner, Friedrich Weber, Proxmox VE development discussion


On 3/6/24 15:04, Fiona Ebner wrote:
> Yes, but the question is what is worse: Needing to re-do the clone or
> having the VM config on the wrong node?
>
>> To avoid that, I'd lean towards keeping the behavior of failing the task
>> if deactivating $newvollist fails. After all, at least in case of LVM
>> not being able to deactivate because the device is in use, we just
>> created $newvollist so hopefully nobody else should be accessing it.
> Fine by me. Yes, it's unlikely to fail. And we can still adapt later if
> users ever complain about it.

I've sent a v2 
https://lists.proxmox.com/pipermail/pve-devel/2024-March/062115.html

Also, i agree with friedrich, but yes, it should be very unlikely to fail.





^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2024-03-06 14:19 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-03-06 10:47 [pve-devel] [PATCH qemu-server 1/1] fix 1734: clone VM: if deactivation fails demote error to warning Hannes Duerr
2024-03-06 11:37 ` Friedrich Weber
2024-03-06 12:31   ` Fiona Ebner
2024-03-06 12:40 ` Fiona Ebner
2024-03-06 13:14   ` Friedrich Weber
2024-03-06 14:04     ` Fiona Ebner
2024-03-06 14:19       ` Hannes Dürr

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox
Service provided by Proxmox Server Solutions GmbH | Privacy | Legal