all lists on lists.proxmox.com
 help / color / mirror / Atom feed
From: "Fabian Grünbichler" <f.gruenbichler@proxmox.com>
To: Proxmox VE development discussion <pve-devel@lists.proxmox.com>
Subject: Re: [pve-devel] [PATCH pve-guest-common 1/1] partially fix #2530: snapshots: add pre/post/failed-snapshot hooks
Date: Wed, 21 Dec 2022 11:44:34 +0100	[thread overview]
Message-ID: <1671617798.6ip8zjgw5r.astroid@yuna.none> (raw)
In-Reply-To: <20221212134312.257662-3-s.hanreich@proxmox.com>

this is v2, right? ;)

On December 12, 2022 2:43 pm, Stefan Hanreich wrote:
> This commit adds hooks to the snapshotting process, which can be used
> to run additional setup scripts to prepare the VM for snapshotting.
> 
> Examples for use cases include:
> * forcing processes to flush their writes
> * blocking processes from writing
> * altering the configuration of the VM to make snapshotting possible
> 
> The prepare step has been split into two parts, so the configuration
> can be locked a bit earlier during the snapshotting process. Doing it
> this way ensures that the configuration is already locked during the
> pre-snapshot hook. Because of this split, the VM config gets written
> in two stages now, rather than one.
> 
> In case of failure during the preparation step - after the lock is
> written - error handling has been added so the lock gets released
> properly. The failed-snapshot hook runs when the snapshot fails, if
> the pre-snapshot hook ran already. This enables users to revert any
> changes done during the pre-snapshot hookscript.

see below
 
> The preparation step assumes that the hook does not convert the
> current VM into a template, which is why the basic checks are not
> re-run after the pre-snapshot hook. The storage check runs after the
> pre-snapshot hook, because the hook might get used to setup the
> storage for snapshotting. If the hook would run after the storage
> checks, this becomes impossible.
> 
> cfs_update() gets called after every invocation of a hookscript, since
> it is impossible to know which changes get made by the hookscript.
> Doing this ensures that we see the updated state of the CFS after the
> hookscript got invoked.
> 
> Signed-off-by: Stefan Hanreich <s.hanreich@proxmox.com>
> ---
>  src/PVE/AbstractConfig.pm | 49 ++++++++++++++++++++++++++++++++++-----
>  1 file changed, 43 insertions(+), 6 deletions(-)
> 
> diff --git a/src/PVE/AbstractConfig.pm b/src/PVE/AbstractConfig.pm
> index a0c0bc6..3bff600 100644
> --- a/src/PVE/AbstractConfig.pm
> +++ b/src/PVE/AbstractConfig.pm
> @@ -710,8 +710,7 @@ sub __snapshot_prepare {
>  
>      my $snap;
>  
> -    my $updatefn =  sub {
> -
> +    my $run_checks = sub {
>  	my $conf = $class->load_config($vmid);
>  
>  	die "you can't take a snapshot if it's a template\n"
> @@ -721,15 +720,21 @@ sub __snapshot_prepare {
>  
>  	$conf->{lock} = 'snapshot';
>  
> -	my $snapshots = $conf->{snapshots};
> -
>  	die "snapshot name '$snapname' already used\n"
> -	    if defined($snapshots->{$snapname});
> +	    if defined($conf->{snapshots}->{$snapname});
> +
> +	$class->write_config($vmid, $conf);
> +    };
>  
> +    my $updatefn = sub {
> +	my $conf = $class->load_config($vmid);
>  	my $storecfg = PVE::Storage::config();
> +
>  	die "snapshot feature is not available\n"
>  	    if !$class->has_feature('snapshot', $conf, $storecfg, undef, undef, $snapname eq 'vzdump');
>  
> +	my $snapshots = $conf->{snapshots};
> +
>  	for my $snap (sort keys %$snapshots) {
>  	    my $parent_name = $snapshots->{$snap}->{parent} // '';
>  	    if ($snapname eq $parent_name) {
> @@ -753,7 +758,32 @@ sub __snapshot_prepare {
>  	$class->write_config($vmid, $conf);
>      };
>  
> -    $class->lock_config($vmid, $updatefn);
> +    $class->lock_config($vmid, $run_checks);
> +
> +    eval {
> +	my $conf = $class->load_config($vmid);
> +	PVE::GuestHelpers::exec_hookscript($conf, $vmid, "pre-snapshot", 1);
> +    };
> +    my $err = $@;
> +
> +    PVE::Cluster::cfs_update();
> +
> +    if ($err) {
> +	$class->remove_lock($vmid, 'snapshot');
> +	die $err;
> +    }
> +

I wonder if we don't also want to call the 'failed-snapshot' phase when just the
pre-snapshot invocation failed? might be possible to combine the error handling
then, although I am not sure it makes it more readable if combined..

> +   
> +    if (my $err = $@) {
> +	my $conf = $class->load_config($vmid);
> +	PVE::GuestHelpers::exec_hookscript($conf, $vmid, "failed-snapshot");

this exec_hookscript needs to be inside an eval {}, with warn in case it fails..

also, this call here happens when preparing for making the snapshot, after
possibly saving the VM state, but before taking the volume snapshots..

> +	PVE::Cluster::cfs_update();
> +
> +	$class->remove_lock($vmid, 'snapshot');
> +	die $err;
> +    }
>  
>      return $snap;
>  }
> @@ -837,11 +867,18 @@ sub snapshot_create {
>  
>      if ($err) {
>  	warn "snapshot create failed: starting cleanup\n";
> +
> +	PVE::GuestHelpers::exec_hookscript($conf, $vmid, "failed-snapshot");

eval + warn as well

this call here happens when the volume snapshots might or might not have been
created already (depending on what exactly the error cause is).

> +	PVE::Cluster::cfs_update();
> +
>  	eval { $class->snapshot_delete($vmid, $snapname, 1, $drivehash); };
>  	warn "$@" if $@;
>  	die "$err\n";
>      }
>  
> +    PVE::GuestHelpers::exec_hookscript($conf, $vmid, "post-snapshot");

and here we have a similar issue (no eval), what should happen if post-snapshot
fails?

A die immediately (very likely wrong, current)
B eval + warn but proceed with commit (possibly leaving leftover hook changes around)
C eval + warn, call failed-snapshot but proceed with commit (gives the
  hookscript a chance to cleanup, but how does it differentiate between the
  different failed-snapshot call sites?)
D eval + delete snapshot (seems suboptimal)
E eval + call failed-snapshot + delete snapshot (same, and also the issue of the
  hookscript being able to know what's going on again)

B and C seem most sensible to me, but C adds to the issue of "missing
failed-snapshot context", depending on what the hookscript is doing..

one way to pass information is via the environment, we do that for the migration
case already (setting PVE_MIGRATED_FROM, so that the pre-start/post-start
hookscript can know the start happens in a migration context, and where to
(possibly) find the guest config..

for example, we could set PVE_SNAPSHOT_PHASE here, and have prepare/commit/post
as sub-phases, or even pass a list of volumes already snapshotted (or created,
in case of vmstate), or ..

obviously setting the environment is only allowed in a forked worker context,
else it would affect the next API endpoint handled by the pveproxy/pvedaemon/..
process, so it might be worth double-checking and cleaning up to avoid
side-effects with replication/migration/.. if we go down that route..

> +    PVE::Cluster::cfs_update();
> +
>      $class->__snapshot_commit($vmid, $snapname);
>  }
>  
> -- 
> 2.30.2
> 
> 
> _______________________________________________
> pve-devel mailing list
> pve-devel@lists.proxmox.com
> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
> 
> 
> 




  reply	other threads:[~2022-12-21 10:44 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-12-12 13:43 [pve-devel] [PATCH v2 pve-guest-common/pve-docs] Add " Stefan Hanreich
2022-12-12 13:43 ` [pve-devel] [PATCH pve-docs 1/1] examples: add pre/post/failed-snapshot hooks to example hookscript Stefan Hanreich
2022-12-12 13:43 ` [pve-devel] [PATCH pve-guest-common 1/1] partially fix #2530: snapshots: add pre/post/failed-snapshot hooks Stefan Hanreich
2022-12-21 10:44   ` Fabian Grünbichler [this message]
2022-12-21 11:26     ` Stefan Hanreich
2022-12-21 12:41       ` Fabian Grünbichler
2022-12-21 12:57         ` Stefan Hanreich

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1671617798.6ip8zjgw5r.astroid@yuna.none \
    --to=f.gruenbichler@proxmox.com \
    --cc=pve-devel@lists.proxmox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.
Service provided by Proxmox Server Solutions GmbH | Privacy | Legal