public inbox for pve-devel@lists.proxmox.com
 help / color / mirror / Atom feed
* [pve-devel] [PATCH qemu-server v3] fix #2788: do not resume vms after backup if they were paused before
@ 2021-01-20 12:32 Dominik Csapak
  2021-01-20 14:15 ` Fabian Ebner
  2021-01-26 17:43 ` [pve-devel] applied: " Thomas Lamprecht
  0 siblings, 2 replies; 5+ messages in thread
From: Dominik Csapak @ 2021-01-20 12:32 UTC (permalink / raw)
  To: pve-devel

by checking if the vm is paused at the beginning and skipping the resume
now we also skip the qga freeze/thaw (which cannot work if the vm is
paused)

moved the 'vm_is_paused' sub from the api to PVE/QemuServer.pm so it is
available everywhere we need it.

since a suspend backup would pause the vm anyway, we can skip that step
also

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
---
changes to v2:
* actually include code changes (git add is not optional), sry for the noise

 PVE/API2/Qemu.pm         | 14 ++------------
 PVE/QemuServer.pm        | 10 ++++++++++
 PVE/VZDump/QemuServer.pm | 11 +++++++++--
 3 files changed, 21 insertions(+), 14 deletions(-)

diff --git a/PVE/API2/Qemu.pm b/PVE/API2/Qemu.pm
index e2d2d67..623c998 100644
--- a/PVE/API2/Qemu.pm
+++ b/PVE/API2/Qemu.pm
@@ -2402,16 +2402,6 @@ __PACKAGE__->register_method({
 	return $rpcenv->fork_worker('qmreset', $vmid, $authuser, $realcmd);
     }});
 
-my sub vm_is_paused {
-    my ($vmid) = @_;
-    my $qmpstatus = eval {
-	PVE::QemuConfig::assert_config_exists_on_node($vmid);
-	mon_cmd($vmid, "query-status");
-    };
-    warn "$@\n" if $@;
-    return $qmpstatus && $qmpstatus->{status} eq "paused";
-}
-
 __PACKAGE__->register_method({
     name => 'vm_shutdown',
     path => '{vmid}/status/shutdown',
@@ -2480,7 +2470,7 @@ __PACKAGE__->register_method({
 	#
 	# checking the qmp status here to get feedback to the gui/cli/api
 	# and the status query should not take too long
-	if (vm_is_paused($vmid)) {
+	if (PVE::QemuServer::vm_is_paused($vmid)) {
 	    if ($param->{forceStop}) {
 		warn "VM is paused - stop instead of shutdown\n";
 		$shutdown = 0;
@@ -2556,7 +2546,7 @@ __PACKAGE__->register_method({
 	my $node = extract_param($param, 'node');
 	my $vmid = extract_param($param, 'vmid');
 
-	die "VM is paused - cannot shutdown\n" if vm_is_paused($vmid);
+	die "VM is paused - cannot shutdown\n" if PVE::QemuServer::vm_is_paused($vmid);
 
 	die "VM $vmid not running\n" if !PVE::QemuServer::check_running($vmid);
 
diff --git a/PVE/QemuServer.pm b/PVE/QemuServer.pm
index 54278e5..d7ee05f 100644
--- a/PVE/QemuServer.pm
+++ b/PVE/QemuServer.pm
@@ -7382,4 +7382,14 @@ sub complete_migration_storage {
     return $res;
 }
 
+sub vm_is_paused {
+    my ($vmid) = @_;
+    my $qmpstatus = eval {
+	PVE::QemuConfig::assert_config_exists_on_node($vmid);
+	mon_cmd($vmid, "query-status");
+    };
+    warn "$@\n" if $@;
+    return $qmpstatus && $qmpstatus->{status} eq "paused";
+}
+
 1;
diff --git a/PVE/VZDump/QemuServer.pm b/PVE/VZDump/QemuServer.pm
index b322701..b5e74d3 100644
--- a/PVE/VZDump/QemuServer.pm
+++ b/PVE/VZDump/QemuServer.pm
@@ -62,8 +62,11 @@ sub prepare {
 	if defined($conf->{name});
 
     $self->{vm_was_running} = 1;
+    $self->{vm_was_paused} = 0;
     if (!PVE::QemuServer::check_running($vmid)) {
 	$self->{vm_was_running} = 0;
+    } elsif (PVE::QemuServer::vm_is_paused($vmid)) {
+	$self->{vm_was_paused} = 1;
     }
 
     $task->{hostname} = $conf->{name};
@@ -176,12 +179,16 @@ sub start_vm {
 sub suspend_vm {
     my ($self, $task, $vmid) = @_;
 
+    return if $self->{vm_was_paused};
+
     $self->cmd ("qm suspend $vmid --skiplock");
 }
 
 sub resume_vm {
     my ($self, $task, $vmid) = @_;
 
+    return if $self->{vm_was_paused};
+
     $self->cmd ("qm resume $vmid --skiplock");
 }
 
@@ -794,7 +801,7 @@ sub _get_task_devlist {
 
 sub qga_fs_freeze {
     my ($self, $task, $vmid) = @_;
-    return if !$self->{vmlist}->{$vmid}->{agent} || $task->{mode} eq 'stop' || !$self->{vm_was_running};
+    return if !$self->{vmlist}->{$vmid}->{agent} || $task->{mode} eq 'stop' || !$self->{vm_was_running} || $self->{vm_was_paused};
 
     if (!PVE::QemuServer::qga_check_running($vmid, 1)) {
 	$self->loginfo("skipping guest-agent 'fs-freeze', agent configured but not running?");
@@ -872,7 +879,7 @@ sub register_qmeventd_handle {
 sub resume_vm_after_job_start {
     my ($self, $task, $vmid) = @_;
 
-    return if !$self->{vm_was_running};
+    return if !$self->{vm_was_running} || $self->{vm_was_paused};
 
     if (my $stoptime = $task->{vmstoptime}) {
 	my $delay = time() - $task->{vmstoptime};
-- 
2.20.1





^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [pve-devel] [PATCH qemu-server v3] fix #2788: do not resume vms after backup if they were paused before
  2021-01-20 12:32 [pve-devel] [PATCH qemu-server v3] fix #2788: do not resume vms after backup if they were paused before Dominik Csapak
@ 2021-01-20 14:15 ` Fabian Ebner
  2021-01-21  9:26   ` Dominik Csapak
  2021-01-26 17:43 ` [pve-devel] applied: " Thomas Lamprecht
  1 sibling, 1 reply; 5+ messages in thread
From: Fabian Ebner @ 2021-01-20 14:15 UTC (permalink / raw)
  To: pve-devel, d.csapak

One minor thing inline, and that likely needs a fix elsewhere if we even 
want to bother. Other than that:

Tested-by: Fabian Ebner <f.ebner@proxmox.com>

On 20.01.21 13:32, Dominik Csapak wrote:
> by checking if the vm is paused at the beginning and skipping the resume
> now we also skip the qga freeze/thaw (which cannot work if the vm is
> paused)
> 
> moved the 'vm_is_paused' sub from the api to PVE/QemuServer.pm so it is
> available everywhere we need it.
> 
> since a suspend backup would pause the vm anyway, we can skip that step
> also
> 
> Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
> ---
> changes to v2:
> * actually include code changes (git add is not optional), sry for the noise
> 
>   PVE/API2/Qemu.pm         | 14 ++------------
>   PVE/QemuServer.pm        | 10 ++++++++++
>   PVE/VZDump/QemuServer.pm | 11 +++++++++--
>   3 files changed, 21 insertions(+), 14 deletions(-)
> 
> diff --git a/PVE/API2/Qemu.pm b/PVE/API2/Qemu.pm
> index e2d2d67..623c998 100644
> --- a/PVE/API2/Qemu.pm
> +++ b/PVE/API2/Qemu.pm
> @@ -2402,16 +2402,6 @@ __PACKAGE__->register_method({
>   	return $rpcenv->fork_worker('qmreset', $vmid, $authuser, $realcmd);
>       }});
>   
> -my sub vm_is_paused {
> -    my ($vmid) = @_;
> -    my $qmpstatus = eval {
> -	PVE::QemuConfig::assert_config_exists_on_node($vmid);
> -	mon_cmd($vmid, "query-status");
> -    };
> -    warn "$@\n" if $@;
> -    return $qmpstatus && $qmpstatus->{status} eq "paused";
> -}
> -
>   __PACKAGE__->register_method({
>       name => 'vm_shutdown',
>       path => '{vmid}/status/shutdown',
> @@ -2480,7 +2470,7 @@ __PACKAGE__->register_method({
>   	#
>   	# checking the qmp status here to get feedback to the gui/cli/api
>   	# and the status query should not take too long
> -	if (vm_is_paused($vmid)) {
> +	if (PVE::QemuServer::vm_is_paused($vmid)) {
>   	    if ($param->{forceStop}) {
>   		warn "VM is paused - stop instead of shutdown\n";
>   		$shutdown = 0;
> @@ -2556,7 +2546,7 @@ __PACKAGE__->register_method({
>   	my $node = extract_param($param, 'node');
>   	my $vmid = extract_param($param, 'vmid');
>   
> -	die "VM is paused - cannot shutdown\n" if vm_is_paused($vmid);
> +	die "VM is paused - cannot shutdown\n" if PVE::QemuServer::vm_is_paused($vmid);
>   
>   	die "VM $vmid not running\n" if !PVE::QemuServer::check_running($vmid);
>   
> diff --git a/PVE/QemuServer.pm b/PVE/QemuServer.pm
> index 54278e5..d7ee05f 100644
> --- a/PVE/QemuServer.pm
> +++ b/PVE/QemuServer.pm
> @@ -7382,4 +7382,14 @@ sub complete_migration_storage {
>       return $res;
>   }
>   
> +sub vm_is_paused {
> +    my ($vmid) = @_;
> +    my $qmpstatus = eval {
> +	PVE::QemuConfig::assert_config_exists_on_node($vmid);
> +	mon_cmd($vmid, "query-status");
> +    };
> +    warn "$@\n" if $@;
> +    return $qmpstatus && $qmpstatus->{status} eq "paused";
> +}
> +
>   1;
> diff --git a/PVE/VZDump/QemuServer.pm b/PVE/VZDump/QemuServer.pm
> index b322701..b5e74d3 100644
> --- a/PVE/VZDump/QemuServer.pm
> +++ b/PVE/VZDump/QemuServer.pm
> @@ -62,8 +62,11 @@ sub prepare {
>   	if defined($conf->{name});
>   
>       $self->{vm_was_running} = 1;
> +    $self->{vm_was_paused} = 0;
>       if (!PVE::QemuServer::check_running($vmid)) {
>   	$self->{vm_was_running} = 0;
> +    } elsif (PVE::QemuServer::vm_is_paused($vmid)) {
> +	$self->{vm_was_paused} = 1;
>       }
>   
>       $task->{hostname} = $conf->{name};
> @@ -176,12 +179,16 @@ sub start_vm {
>   sub suspend_vm {
>       my ($self, $task, $vmid) = @_;
>   
> +    return if $self->{vm_was_paused};
> +
>       $self->cmd ("qm suspend $vmid --skiplock");
>   }
>   
>   sub resume_vm {
>       my ($self, $task, $vmid) = @_;
>   
> +    return if $self->{vm_was_paused};
> +
>       $self->cmd ("qm resume $vmid --skiplock");
>   }
>   
> @@ -794,7 +801,7 @@ sub _get_task_devlist {
>   
>   sub qga_fs_freeze {
>       my ($self, $task, $vmid) = @_;
> -    return if !$self->{vmlist}->{$vmid}->{agent} || $task->{mode} eq 'stop' || !$self->{vm_was_running};
> +    return if !$self->{vmlist}->{$vmid}->{agent} || $task->{mode} eq 'stop' || !$self->{vm_was_running} || $self->{vm_was_paused};

It's possible to suspend and resume VMs during backup. E.g. somebody 
else can resume the VM (even without using --skiplock) right after the 
backup task called suspend_vm. Relying on vm_was_paused here leads to 
skipping the fs-freeze for such an (admittedly unlikely) edge case.

>   
>       if (!PVE::QemuServer::qga_check_running($vmid, 1)) {
>   	$self->loginfo("skipping guest-agent 'fs-freeze', agent configured but not running?");
> @@ -872,7 +879,7 @@ sub register_qmeventd_handle {
>   sub resume_vm_after_job_start {
>       my ($self, $task, $vmid) = @_;
>   
> -    return if !$self->{vm_was_running};
> +    return if !$self->{vm_was_running} || $self->{vm_was_paused};
>   
>       if (my $stoptime = $task->{vmstoptime}) {
>   	my $delay = time() - $task->{vmstoptime};
> 




^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [pve-devel] [PATCH qemu-server v3] fix #2788: do not resume vms after backup if they were paused before
  2021-01-20 14:15 ` Fabian Ebner
@ 2021-01-21  9:26   ` Dominik Csapak
  2021-01-21 10:41     ` Fabian Ebner
  0 siblings, 1 reply; 5+ messages in thread
From: Dominik Csapak @ 2021-01-21  9:26 UTC (permalink / raw)
  To: Fabian Ebner, pve-devel

On 1/20/21 3:15 PM, Fabian Ebner wrote:
> One minor thing inline, and that likely needs a fix elsewhere if we even 
> want to bother. Other than that:
> 
> Tested-by: Fabian Ebner <f.ebner@proxmox.com>
> 
> On 20.01.21 13:32, Dominik Csapak wrote:
>> by checking if the vm is paused at the beginning and skipping the resume
>> now we also skip the qga freeze/thaw (which cannot work if the vm is
>> paused)
>>
>> moved the 'vm_is_paused' sub from the api to PVE/QemuServer.pm so it is
>> available everywhere we need it.
>>
>> since a suspend backup would pause the vm anyway, we can skip that step
>> also
>>
>> Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
>> ---
>> changes to v2:
>> * actually include code changes (git add is not optional), sry for the 
>> noise
>>
>>   PVE/API2/Qemu.pm         | 14 ++------------
>>   PVE/QemuServer.pm        | 10 ++++++++++
>>   PVE/VZDump/QemuServer.pm | 11 +++++++++--
>>   3 files changed, 21 insertions(+), 14 deletions(-)
>>
>> diff --git a/PVE/API2/Qemu.pm b/PVE/API2/Qemu.pm
>> index e2d2d67..623c998 100644
>> --- a/PVE/API2/Qemu.pm
>> +++ b/PVE/API2/Qemu.pm
>> @@ -2402,16 +2402,6 @@ __PACKAGE__->register_method({
>>       return $rpcenv->fork_worker('qmreset', $vmid, $authuser, $realcmd);
>>       }});
>> -my sub vm_is_paused {
>> -    my ($vmid) = @_;
>> -    my $qmpstatus = eval {
>> -    PVE::QemuConfig::assert_config_exists_on_node($vmid);
>> -    mon_cmd($vmid, "query-status");
>> -    };
>> -    warn "$@\n" if $@;
>> -    return $qmpstatus && $qmpstatus->{status} eq "paused";
>> -}
>> -
>>   __PACKAGE__->register_method({
>>       name => 'vm_shutdown',
>>       path => '{vmid}/status/shutdown',
>> @@ -2480,7 +2470,7 @@ __PACKAGE__->register_method({
>>       #
>>       # checking the qmp status here to get feedback to the gui/cli/api
>>       # and the status query should not take too long
>> -    if (vm_is_paused($vmid)) {
>> +    if (PVE::QemuServer::vm_is_paused($vmid)) {
>>           if ($param->{forceStop}) {
>>           warn "VM is paused - stop instead of shutdown\n";
>>           $shutdown = 0;
>> @@ -2556,7 +2546,7 @@ __PACKAGE__->register_method({
>>       my $node = extract_param($param, 'node');
>>       my $vmid = extract_param($param, 'vmid');
>> -    die "VM is paused - cannot shutdown\n" if vm_is_paused($vmid);
>> +    die "VM is paused - cannot shutdown\n" if 
>> PVE::QemuServer::vm_is_paused($vmid);
>>       die "VM $vmid not running\n" if 
>> !PVE::QemuServer::check_running($vmid);
>> diff --git a/PVE/QemuServer.pm b/PVE/QemuServer.pm
>> index 54278e5..d7ee05f 100644
>> --- a/PVE/QemuServer.pm
>> +++ b/PVE/QemuServer.pm
>> @@ -7382,4 +7382,14 @@ sub complete_migration_storage {
>>       return $res;
>>   }
>> +sub vm_is_paused {
>> +    my ($vmid) = @_;
>> +    my $qmpstatus = eval {
>> +    PVE::QemuConfig::assert_config_exists_on_node($vmid);
>> +    mon_cmd($vmid, "query-status");
>> +    };
>> +    warn "$@\n" if $@;
>> +    return $qmpstatus && $qmpstatus->{status} eq "paused";
>> +}
>> +
>>   1;
>> diff --git a/PVE/VZDump/QemuServer.pm b/PVE/VZDump/QemuServer.pm
>> index b322701..b5e74d3 100644
>> --- a/PVE/VZDump/QemuServer.pm
>> +++ b/PVE/VZDump/QemuServer.pm
>> @@ -62,8 +62,11 @@ sub prepare {
>>       if defined($conf->{name});
>>       $self->{vm_was_running} = 1;
>> +    $self->{vm_was_paused} = 0;
>>       if (!PVE::QemuServer::check_running($vmid)) {
>>       $self->{vm_was_running} = 0;
>> +    } elsif (PVE::QemuServer::vm_is_paused($vmid)) {
>> +    $self->{vm_was_paused} = 1;
>>       }
>>       $task->{hostname} = $conf->{name};
>> @@ -176,12 +179,16 @@ sub start_vm {
>>   sub suspend_vm {
>>       my ($self, $task, $vmid) = @_;
>> +    return if $self->{vm_was_paused};
>> +
>>       $self->cmd ("qm suspend $vmid --skiplock");
>>   }
>>   sub resume_vm {
>>       my ($self, $task, $vmid) = @_;
>> +    return if $self->{vm_was_paused};
>> +
>>       $self->cmd ("qm resume $vmid --skiplock");
>>   }
>> @@ -794,7 +801,7 @@ sub _get_task_devlist {
>>   sub qga_fs_freeze {
>>       my ($self, $task, $vmid) = @_;
>> -    return if !$self->{vmlist}->{$vmid}->{agent} || $task->{mode} eq 
>> 'stop' || !$self->{vm_was_running};
>> +    return if !$self->{vmlist}->{$vmid}->{agent} || $task->{mode} eq 
>> 'stop' || !$self->{vm_was_running} || $self->{vm_was_paused};
> 
> It's possible to suspend and resume VMs during backup. E.g. somebody 
> else can resume the VM (even without using --skiplock) right after the 
> backup task called suspend_vm. Relying on vm_was_paused here leads to 
> skipping the fs-freeze for such an (admittedly unlikely) edge case.

i mean i get why it is possible to resume / suspend during backup,
but really this should not be possible until the backup started

so maybe we have to have some kind of intermediate locking?

alternatively we can replace that check here
with an actual qmp status check (that is also racy though...)

also if i suspend between the paused check and the resume part
it will resume it again..

> 
>>       if (!PVE::QemuServer::qga_check_running($vmid, 1)) {
>>       $self->loginfo("skipping guest-agent 'fs-freeze', agent 
>> configured but not running?");
>> @@ -872,7 +879,7 @@ sub register_qmeventd_handle {
>>   sub resume_vm_after_job_start {
>>       my ($self, $task, $vmid) = @_;
>> -    return if !$self->{vm_was_running};
>> +    return if !$self->{vm_was_running} || $self->{vm_was_paused};
>>       if (my $stoptime = $task->{vmstoptime}) {
>>       my $delay = time() - $task->{vmstoptime};
>>





^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [pve-devel] [PATCH qemu-server v3] fix #2788: do not resume vms after backup if they were paused before
  2021-01-21  9:26   ` Dominik Csapak
@ 2021-01-21 10:41     ` Fabian Ebner
  0 siblings, 0 replies; 5+ messages in thread
From: Fabian Ebner @ 2021-01-21 10:41 UTC (permalink / raw)
  To: Dominik Csapak, pve-devel

On 21.01.21 10:26, Dominik Csapak wrote:
> On 1/20/21 3:15 PM, Fabian Ebner wrote:
>> One minor thing inline, and that likely needs a fix elsewhere if we 
>> even want to bother. Other than that:
>>
>> Tested-by: Fabian Ebner <f.ebner@proxmox.com>
>>
>> On 20.01.21 13:32, Dominik Csapak wrote:
>>> by checking if the vm is paused at the beginning and skipping the resume
>>> now we also skip the qga freeze/thaw (which cannot work if the vm is
>>> paused)
>>>
>>> moved the 'vm_is_paused' sub from the api to PVE/QemuServer.pm so it is
>>> available everywhere we need it.
>>>
>>> since a suspend backup would pause the vm anyway, we can skip that step
>>> also
>>>
>>> Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
>>> ---
>>> changes to v2:
>>> * actually include code changes (git add is not optional), sry for 
>>> the noise
>>>
>>>   PVE/API2/Qemu.pm         | 14 ++------------
>>>   PVE/QemuServer.pm        | 10 ++++++++++
>>>   PVE/VZDump/QemuServer.pm | 11 +++++++++--
>>>   3 files changed, 21 insertions(+), 14 deletions(-)
>>>
>>> diff --git a/PVE/API2/Qemu.pm b/PVE/API2/Qemu.pm
>>> index e2d2d67..623c998 100644
>>> --- a/PVE/API2/Qemu.pm
>>> +++ b/PVE/API2/Qemu.pm
>>> @@ -2402,16 +2402,6 @@ __PACKAGE__->register_method({
>>>       return $rpcenv->fork_worker('qmreset', $vmid, $authuser, 
>>> $realcmd);
>>>       }});
>>> -my sub vm_is_paused {
>>> -    my ($vmid) = @_;
>>> -    my $qmpstatus = eval {
>>> -    PVE::QemuConfig::assert_config_exists_on_node($vmid);
>>> -    mon_cmd($vmid, "query-status");
>>> -    };
>>> -    warn "$@\n" if $@;
>>> -    return $qmpstatus && $qmpstatus->{status} eq "paused";
>>> -}
>>> -
>>>   __PACKAGE__->register_method({
>>>       name => 'vm_shutdown',
>>>       path => '{vmid}/status/shutdown',
>>> @@ -2480,7 +2470,7 @@ __PACKAGE__->register_method({
>>>       #
>>>       # checking the qmp status here to get feedback to the gui/cli/api
>>>       # and the status query should not take too long
>>> -    if (vm_is_paused($vmid)) {
>>> +    if (PVE::QemuServer::vm_is_paused($vmid)) {
>>>           if ($param->{forceStop}) {
>>>           warn "VM is paused - stop instead of shutdown\n";
>>>           $shutdown = 0;
>>> @@ -2556,7 +2546,7 @@ __PACKAGE__->register_method({
>>>       my $node = extract_param($param, 'node');
>>>       my $vmid = extract_param($param, 'vmid');
>>> -    die "VM is paused - cannot shutdown\n" if vm_is_paused($vmid);
>>> +    die "VM is paused - cannot shutdown\n" if 
>>> PVE::QemuServer::vm_is_paused($vmid);
>>>       die "VM $vmid not running\n" if 
>>> !PVE::QemuServer::check_running($vmid);
>>> diff --git a/PVE/QemuServer.pm b/PVE/QemuServer.pm
>>> index 54278e5..d7ee05f 100644
>>> --- a/PVE/QemuServer.pm
>>> +++ b/PVE/QemuServer.pm
>>> @@ -7382,4 +7382,14 @@ sub complete_migration_storage {
>>>       return $res;
>>>   }
>>> +sub vm_is_paused {
>>> +    my ($vmid) = @_;
>>> +    my $qmpstatus = eval {
>>> +    PVE::QemuConfig::assert_config_exists_on_node($vmid);
>>> +    mon_cmd($vmid, "query-status");
>>> +    };
>>> +    warn "$@\n" if $@;
>>> +    return $qmpstatus && $qmpstatus->{status} eq "paused";
>>> +}
>>> +
>>>   1;
>>> diff --git a/PVE/VZDump/QemuServer.pm b/PVE/VZDump/QemuServer.pm
>>> index b322701..b5e74d3 100644
>>> --- a/PVE/VZDump/QemuServer.pm
>>> +++ b/PVE/VZDump/QemuServer.pm
>>> @@ -62,8 +62,11 @@ sub prepare {
>>>       if defined($conf->{name});
>>>       $self->{vm_was_running} = 1;
>>> +    $self->{vm_was_paused} = 0;
>>>       if (!PVE::QemuServer::check_running($vmid)) {
>>>       $self->{vm_was_running} = 0;
>>> +    } elsif (PVE::QemuServer::vm_is_paused($vmid)) {
>>> +    $self->{vm_was_paused} = 1;
>>>       }
>>>       $task->{hostname} = $conf->{name};
>>> @@ -176,12 +179,16 @@ sub start_vm {
>>>   sub suspend_vm {
>>>       my ($self, $task, $vmid) = @_;
>>> +    return if $self->{vm_was_paused};
>>> +
>>>       $self->cmd ("qm suspend $vmid --skiplock");
>>>   }
>>>   sub resume_vm {
>>>       my ($self, $task, $vmid) = @_;
>>> +    return if $self->{vm_was_paused};
>>> +
>>>       $self->cmd ("qm resume $vmid --skiplock");
>>>   }
>>> @@ -794,7 +801,7 @@ sub _get_task_devlist {
>>>   sub qga_fs_freeze {
>>>       my ($self, $task, $vmid) = @_;
>>> -    return if !$self->{vmlist}->{$vmid}->{agent} || $task->{mode} eq 
>>> 'stop' || !$self->{vm_was_running};
>>> +    return if !$self->{vmlist}->{$vmid}->{agent} || $task->{mode} eq 
>>> 'stop' || !$self->{vm_was_running} || $self->{vm_was_paused};
>>
>> It's possible to suspend and resume VMs during backup. E.g. somebody 
>> else can resume the VM (even without using --skiplock) right after the 
>> backup task called suspend_vm. Relying on vm_was_paused here leads to 
>> skipping the fs-freeze for such an (admittedly unlikely) edge case.
> 
> i mean i get why it is possible to resume / suspend during backup,
> but really this should not be possible until the backup started
> 
> so maybe we have to have some kind of intermediate locking?
> 

Yes, it should be enough if there is something like a 'backup-init' lock 
in the config during the initial phase. The vm_resume function 
explicitly checks for the 'backup' lock to decide if it can proceed.

> alternatively we can replace that check here
> with an actual qmp status check (that is also racy though...)
> 
> also if i suspend between the paused check and the resume part
> it will resume it again..
> 

If we had the above locking, I think it should work for snapshot mode 
already. Only suspend mode has the resume call at the very end in the 
cleanup phase of the backup. And maybe that call isn't actually needed ?
* for containers 'cleanup->{resume} is set to 0' again, so it doesn't matter
* for VMs, it's resumed within archive() via resume_vm_after_job_start() 
already before that cleanup phase
Or should suspend mode force the VM to be suspended during the whole backup?

Last, but not least (from the docs):

suspend mode
This mode is provided for compatibility reason, and suspends the VM 
before calling the snapshot mode. Since suspending the VM results in a 
longer downtime and does not necessarily improve the data consistency, 
the use of the snapshot mode is recommended instead.

Are there any plans to remove suspend mode for VMs for PVE 7.0?

>>
>>>       if (!PVE::QemuServer::qga_check_running($vmid, 1)) {
>>>       $self->loginfo("skipping guest-agent 'fs-freeze', agent 
>>> configured but not running?");
>>> @@ -872,7 +879,7 @@ sub register_qmeventd_handle {
>>>   sub resume_vm_after_job_start {
>>>       my ($self, $task, $vmid) = @_;
>>> -    return if !$self->{vm_was_running};
>>> +    return if !$self->{vm_was_running} || $self->{vm_was_paused};
>>>       if (my $stoptime = $task->{vmstoptime}) {
>>>       my $delay = time() - $task->{vmstoptime};
>>>
> 
> 




^ permalink raw reply	[flat|nested] 5+ messages in thread

* [pve-devel] applied: [PATCH qemu-server v3] fix #2788: do not resume vms after backup if they were paused before
  2021-01-20 12:32 [pve-devel] [PATCH qemu-server v3] fix #2788: do not resume vms after backup if they were paused before Dominik Csapak
  2021-01-20 14:15 ` Fabian Ebner
@ 2021-01-26 17:43 ` Thomas Lamprecht
  1 sibling, 0 replies; 5+ messages in thread
From: Thomas Lamprecht @ 2021-01-26 17:43 UTC (permalink / raw)
  To: Proxmox VE development discussion, Dominik Csapak

On 20.01.21 13:32, Dominik Csapak wrote:
> by checking if the vm is paused at the beginning and skipping the resume
> now we also skip the qga freeze/thaw (which cannot work if the vm is
> paused)
> 
> moved the 'vm_is_paused' sub from the api to PVE/QemuServer.pm so it is
> available everywhere we need it.
> 
> since a suspend backup would pause the vm anyway, we can skip that step
> also
> 
> Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
> ---
> changes to v2:
> * actually include code changes (git add is not optional), sry for the noise
> 
>  PVE/API2/Qemu.pm         | 14 ++------------
>  PVE/QemuServer.pm        | 10 ++++++++++
>  PVE/VZDump/QemuServer.pm | 11 +++++++++--
>  3 files changed, 21 insertions(+), 14 deletions(-)
> 
>

applied, with Fabi's T-b, thanks!

Not sure if it'd be really worth it to do the backup-init lock proposed in
a thread reply here though..




^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2021-01-26 17:44 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-01-20 12:32 [pve-devel] [PATCH qemu-server v3] fix #2788: do not resume vms after backup if they were paused before Dominik Csapak
2021-01-20 14:15 ` Fabian Ebner
2021-01-21  9:26   ` Dominik Csapak
2021-01-21 10:41     ` Fabian Ebner
2021-01-26 17:43 ` [pve-devel] applied: " Thomas Lamprecht

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox
Service provided by Proxmox Server Solutions GmbH | Privacy | Legal