public inbox for pve-devel@lists.proxmox.com
* [pve-devel] [PATCH pve-manager] fix #3369: auto-start vms after failed pbs backup
@ 2021-04-07 14:23 Dylan Whyte
  2021-04-08  6:41 ` Fabian Grünbichler
  0 siblings, 1 reply; 2+ messages in thread
From: Dylan Whyte @ 2021-04-07 14:23 UTC
  To: pve-devel

Fixes an issue in which a VM fails to automatically restart after a
failed stop-mode backup to pbs.

Signed-off-by: Dylan Whyte <d.whyte@proxmox.com>
---
 PVE/VZDump.pm | 14 +++++++++++---
 1 file changed, 11 insertions(+), 3 deletions(-)

Notes:
1. The 1-second delay is needed because the check for whether the VM is
running still returned true when this code was executed (although the VM
was just about to stop).

2. The previously used vm_status call only checks whether a PID exists and
returns true if so. It therefore also returns true when the VM is in the
"prelaunch" state, so PVE::QemuServer::vmstatus is used instead to get the
exact state and handle the situation accordingly. Otherwise, the VM
occasionally gets stuck in the prelaunch state.
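
For illustration only, the state handling described in note 2 can be
sketched as a standalone script. This is a minimal sketch, not the patch
itself: decide_action and the hard-coded hash are hypothetical stand-ins,
and only the hash shape ($vmstatus->{$vmid}->{qmpstatus}) is taken from
the patch.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of PVE: map a qmpstatus string to the
# action the patch takes after a failed stop-mode backup.
sub decide_action {
    my ($status) = @_;

    return 'none' if !defined($status);
    return 'start' if $status eq 'stopped';
    return 'resume' if $status eq 'prelaunch';
    return 'none'; # e.g. 'running': leave the guest alone
}

# Stand-in data shaped like the hash the patch reads.
my $vmstatus = {
    100 => { qmpstatus => 'stopped' },
    101 => { qmpstatus => 'prelaunch' },
    102 => { qmpstatus => 'running' },
};

for my $vmid (sort keys %$vmstatus) {
    my $action = decide_action($vmstatus->{$vmid}->{qmpstatus});
    print "$vmid: $action\n";
}
```

A plain "PID exists" check cannot tell the second and third guest apart,
which is why the exact state is consulted here.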


diff --git a/PVE/VZDump.pm b/PVE/VZDump.pm
index fb4c8bad..1bda1f15 100644
--- a/PVE/VZDump.pm
+++ b/PVE/VZDump.pm
@@ -23,6 +23,7 @@ use PVE::VZDump::Common;
 use PVE::VZDump::Plugin;
 use PVE::Tools qw(extract_param split_list);
 use PVE::API2Tools;
+use PVE::QemuServer;
 
 my @posix_filesystems = qw(ext3 ext4 nfs nfs4 reiserfs xfs);
 
@@ -1039,10 +1040,17 @@ sub exec_backup_task {
 		    debugmsg ('info', "resume vm", $logfd);
 		    $plugin->resume_vm ($task, $vmid);
 		} else {
-		    my $running = $plugin->vm_status($vmid);
-		    if (!$running) {
+		    sleep(1);
+		    my $vmstatus = PVE::QemuServer::vmstatus($vmid, 1);
+		    my $stat = $vmstatus->{$vmid};
+		    my $status = $stat->{qmpstatus};
+
+		    if ($status eq "stopped") {
+			$plugin->start_vm ($task, $vmid);
+			debugmsg ('info', "restarting vm", $logfd);
+		    } elsif ($status eq "prelaunch") {
+			$plugin->resume_vm ($task, $vmid);
 			debugmsg ('info', "restarting vm", $logfd);
-			$plugin->start_vm ($task, $vmid);
 		    }
 		}
 		$self->run_hook_script ('post-restart', $task, $logfd);
-- 
2.20.1






* Re: [pve-devel] [PATCH pve-manager] fix #3369: auto-start vms after failed pbs backup
From: Fabian Grünbichler @ 2021-04-08  6:41 UTC
  To: Proxmox VE development discussion

On April 7, 2021 4:23 pm, Dylan Whyte wrote:
> Fixes an issue in which a VM fails to automatically restart after a
> failed stop-mode backup to pbs.
> 
> Signed-off-by: Dylan Whyte <d.whyte@proxmox.com>
> ---
>  PVE/VZDump.pm | 14 +++++++++++---
>  1 file changed, 11 insertions(+), 3 deletions(-)
> 
> Notes:
> 1. The 1-second delay is needed because the check for whether the VM is
> running still returned true when this code was executed (although the VM
> was just about to stop).
> 
> 2. The previously used vm_status call only checks whether a PID exists and
> returns true if so. It therefore also returns true when the VM is in the
> "prelaunch" state, so PVE::QemuServer::vmstatus is used instead to get the
> exact state and handle the situation accordingly. Otherwise, the VM
> occasionally gets stuck in the prelaunch state.
> 
> 
> diff --git a/PVE/VZDump.pm b/PVE/VZDump.pm
> index fb4c8bad..1bda1f15 100644
> --- a/PVE/VZDump.pm
> +++ b/PVE/VZDump.pm
> @@ -23,6 +23,7 @@ use PVE::VZDump::Common;
>  use PVE::VZDump::Plugin;
>  use PVE::Tools qw(extract_param split_list);
>  use PVE::API2Tools;
> +use PVE::QemuServer;
>  
>  my @posix_filesystems = qw(ext3 ext4 nfs nfs4 reiserfs xfs);
>  
> @@ -1039,10 +1040,17 @@ sub exec_backup_task {
>  		    debugmsg ('info', "resume vm", $logfd);
>  		    $plugin->resume_vm ($task, $vmid);
>  		} else {
> -		    my $running = $plugin->vm_status($vmid);
> -		    if (!$running) {
> +		    sleep(1);

I wonder where this second comes from? some kind of timeout in PBS code?
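
one alternative to a fixed delay would be to poll until the state settles.
a minimal sketch, assuming a hypothetical status callback
(wait_for_settled_status and its parameters are made up for illustration
and are not PVE API):

```perl
use strict;
use warnings;
use Time::HiRes qw(sleep);

# Hypothetical helper: call $get_status until it stops reporting
# 'running' or the attempts run out, instead of one fixed sleep(1).
# Returns the last status seen.
sub wait_for_settled_status {
    my ($get_status, $max_tries, $interval) = @_;

    my $status;
    for my $try (1 .. $max_tries) {
        $status = $get_status->();
        return $status if !defined($status) || $status ne 'running';
        sleep($interval); # fractional seconds via Time::HiRes
    }
    return $status;
}

# Simulated guest that reports 'running' twice before it is 'stopped'.
my @reports = ('running', 'running', 'stopped');
my $status = wait_for_settled_status(sub { shift @reports }, 10, 0.1);
print "settled status: $status\n";
```

this still needs an upper bound, since a guest that really is running
never leaves that state.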

> +		    my $vmstatus = PVE::QemuServer::vmstatus($vmid, 1);

we don't know that this is a VM at this point?

> +		    my $stat = $vmstatus->{$vmid};
> +		    my $status = $stat->{qmpstatus};
> +
> +		    if ($status eq "stopped") {
> +			$plugin->start_vm ($task, $vmid);
> +			debugmsg ('info', "restarting vm", $logfd);
> +		    } elsif ($status eq "prelaunch") {
> +			$plugin->resume_vm ($task, $vmid);

this can occur if

- the VM was running at the start of the backup, but with stop mode
- a problem occurred while the VM was in the prelaunch state

normally, the qemu-server VZDump plugin handles resuming. but there are 
two 'die' statements in archive_pbs that can trigger before resuming 
happens, and restoring the power state does nothing if the VM is already 
running. so either of those two should be fixed to handle the prelaunch 
issue.

the prelaunch issue also seems to affect VMA, although it might be 
harder to reliably trigger an error during the initial backup start 
window.
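
as a rough sketch of the first option (making sure a resume still happens
when an early 'die' triggers), assuming hypothetical stand-ins:
with_cleanup_on_error and both callbacks are illustrative only and not
taken from the qemu-server plugin:

```perl
use strict;
use warnings;

# Hypothetical wrapper: run $code; if it dies, run $cleanup
# before re-raising the original error.
sub with_cleanup_on_error {
    my ($code, $cleanup) = @_;

    my $res = eval { $code->() };
    if (my $err = $@) {
        eval { $cleanup->() };
        warn "cleanup failed: $@" if $@;
        die $err;
    }
    return $res;
}

my $resumed = 0;
eval {
    with_cleanup_on_error(
        sub { die "backup setup failed\n" }, # stands in for an early 'die'
        sub { $resumed = 1 },                # stands in for resuming the guest
    );
};
print "error: $@";
print "resumed: $resumed\n";
```

the original error is kept and re-raised, so the backup still fails, but
the guest is not left behind in the prelaunch state.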

>  			debugmsg ('info', "restarting vm", $logfd);
> -			$plugin->start_vm ($task, $vmid);
>  		    }
>  		}
>  		$self->run_hook_script ('post-restart', $task, $logfd);
> -- 
> 2.20.1
> 
> 
> 



