public inbox for pve-devel@lists.proxmox.com
 help / color / mirror / Atom feed
* Re: [pve-devel] [PATCH storage 4/6] Add prune_backups to storage API
       [not found]   ` <1594035985.yxpjph40wm.astroid@nora.none>
@ 2020-07-08  7:20     ` Fabian Ebner
  2020-07-08  7:26     ` Fabian Ebner
  1 sibling, 0 replies; 4+ messages in thread
From: Fabian Ebner @ 2020-07-08  7:20 UTC (permalink / raw)
  To: pve-devel; +Cc: Fabian Grünbichler, Thomas Lamprecht

Am 07.07.20 um 08:38 schrieb Fabian Grünbichler:
> On June 4, 2020 11:08 am, Fabian Ebner wrote:
>> Implement it for generic storages supporting backups (i.e.
>> directory-based storages) and add a wrapper for PBS.
>>
>> Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
>> ---
>>   PVE/Storage.pm             |  27 ++++-
>>   PVE/Storage/PBSPlugin.pm   |  50 ++++++++
>>   PVE/Storage/Plugin.pm      | 128 ++++++++++++++++++++
>>   test/prune_backups_test.pm | 237 +++++++++++++++++++++++++++++++++++++
>>   test/run_plugin_tests.pl   |   1 +
>>   5 files changed, 441 insertions(+), 2 deletions(-)
>>   create mode 100644 test/prune_backups_test.pm
>>
>> diff --git a/PVE/Storage.pm b/PVE/Storage.pm
>> index 07a4f53..e1d0d02 100755
>> --- a/PVE/Storage.pm
>> +++ b/PVE/Storage.pm
>> @@ -40,11 +40,11 @@ use PVE::Storage::DRBDPlugin;
>>   use PVE::Storage::PBSPlugin;
>>   
>>   # Storage API version. Icrement it on changes in storage API interface.
>> -use constant APIVER => 5;
>> +use constant APIVER => 6;
>>   # Age is the number of versions we're backward compatible with.
>>   # This is like having 'current=APIVER' and age='APIAGE' in libtool,
>>   # see https://www.gnu.org/software/libtool/manual/html_node/Libtool-versioning.html
>> -use constant APIAGE => 4;
>> +use constant APIAGE => 5;
>>   
>>   # load standard plugins
>>   PVE::Storage::DirPlugin->register();
>> @@ -1522,6 +1522,29 @@ sub extract_vzdump_config {
>>       }
>>   }
>>   
>> +sub prune_backups {
>> +    my ($cfg, $storeid, $opts_override, $vmid, $dryrun) = @_;
> 
> s/opts_override/opts/
> s/prune_opts/opts/
> 
> or maybe $keep? since we only have either, and they both represent the
> same options..
> 

Ok

>> +
>> +    my $scfg = storage_config($cfg, $storeid);
>> +    die "storage '$storeid' does not support backups\n" if !$scfg->{content}->{backup};
>> +
>> +    my $prune_opts = $opts_override;
> 
> not needed then
> 
>> +    if (!defined($prune_opts)) {
>> +	die "no prune-backups options configured for storage '$storeid'\n"
>> +	    if !defined($scfg->{'prune-backups'});
>> +	$prune_opts = PVE::JSONSchema::parse_property_string('prune-backups', $scfg->{'prune-backups'});
>> +    }
>> +
>> +    my $all_zero = 1;
>> +    foreach my $opt (keys %{$prune_opts}) {
>> +	$all_zero = 0 if defined($prune_opts->{$opt}) && $prune_opts->{$opt} > 0;
>> +    }
>> +    die "at least one keep-option must be set and positive\n" if $all_zero;
> 
> if we switch to integer:
> 
> die ...
>      if !grep { defined($opts->{$_}) && $opts->{$_} } keys %$opts;
> 
> or we could use the new format validator functionality to ensure this
> directly in parse_property_string ;)
> 

Yes, I felt like here would be a good use case for the validation when 
writing this. And using 'values' instead of 'keys' it becomes even 
simpler. I'll switch to integer and allow 0 as well.

>> +
>> +    my $plugin = PVE::Storage::Plugin->lookup($scfg->{type});
>> +    return $plugin->prune_backups($scfg, $storeid, $prune_opts, $vmid, $dryrun);
>> +}
>> +
>>   sub volume_export {
>>       my ($cfg, $fh, $volid, $format, $snapshot, $base_snapshot, $with_snapshots) = @_;
>>   
>> diff --git a/PVE/Storage/PBSPlugin.pm b/PVE/Storage/PBSPlugin.pm
>> index fba4b2b..f37bd66 100644
>> --- a/PVE/Storage/PBSPlugin.pm
>> +++ b/PVE/Storage/PBSPlugin.pm
>> @@ -173,6 +173,56 @@ sub extract_vzdump_config {
>>       return $config;
>>   }
>>   
>> +sub prune_backups {
>> +    my ($class, $scfg, $storeid, $prune_opts, $vmid, $dryrun) = @_;
>> +
>> +    my $backups = $class->list_volumes($storeid, $scfg, $vmid, ['backup']);
>> +
>> +    my $backup_groups = {};
>> +    foreach my $backup (@{$backups}) {
>> +	(my $guest_type = $backup->{format}) =~ s/^pbs-//;
>> +	my $backup_group = "$guest_type/$backup->{vmid}";
>> +	$backup_groups->{$backup_group} = 1;
>> +    }
>> +
>> +    my @param;
>> +    foreach my $prune_option (keys %{$prune_opts}) {
>> +	push @param, "--$prune_option";
>> +	push @param, "$prune_opts->{$prune_option}";
>> +    }
>> +
>> +    push @param, '--dry-run' if $dryrun;
>> +    push @param, '--output-format=json';
>> +
>> +    my @prune_list;
> 
> why not a reference? would make extending this easier in the future wrt
> return values
> 

Ok

>> +
>> +    foreach my $backup_group (keys %{$backup_groups}) {
>> +	my $json_str;
>> +	my $outfunc = sub { $json_str .= "$_[0]\n" };
> 
> a simple
> 
> sub { $json_str .= shift }
> 
> should be enough, since we don't actually care about the new lines?
> 
>> +
>> +	run_raw_client_cmd($scfg, $storeid, 'prune', [ $backup_group, @param ],
>> +			   outfunc => $outfunc, errmsg => 'proxmox-backup-client failed');
> 
> error handling here
> 
>> +
>> +	my $res = decode_json($json_str);
> 
> and here would be nice (at least to get better error messages, possibly
> to warn and continue with pruning other backup groups and die at the
> end?)
> 
>> +	foreach my $backup (@{$res}) {
>> +	    my $type = $backup->{'backup-type'};
>> +	    my $vmid = $backup->{'backup-id'};
>> +	    my $ctime = $backup->{'backup-time'};
>> +	    my $time_str = localtime($ctime);
>> +	    my $prune_entry = {
>> +		volid => "$type/$vmid-$time_str",
>> +		type => $type eq 'vm' ? 'qemu' : 'lxc',
>> +		ctime => $ctime,
>> +		vmid => $vmid,
>> +		mark => $backup->{keep} ? 'keep' : 'remove',
> 
> depending on how stable this interface is, it might make sense to ensure
> all the expected keys are actually defined..
> 

I'll add an eval around each backup_group iteration and simply warn on 
failure or if some keys are not defined. I'll also switch over to using 
run_client_command instead of run_raw_client_command.

>> +	    };
>> +	    push @prune_list, $prune_entry;
>> +	}
>> +    }
>> +
>> +    return @prune_list;
>> +}
>> +
>>   sub on_add_hook {
>>       my ($class, $storeid, $scfg, %param) = @_;
>>   
>> diff --git a/PVE/Storage/Plugin.pm b/PVE/Storage/Plugin.pm
>> index 1ba1b99..a11bba5 100644
>> --- a/PVE/Storage/Plugin.pm
>> +++ b/PVE/Storage/Plugin.pm
>> @@ -8,6 +8,7 @@ use File::chdir;
>>   use File::Path;
>>   use File::Basename;
>>   use File::stat qw();
>> +use POSIX qw(strftime);
>>   
>>   use PVE::Tools qw(run_command);
>>   use PVE::JSONSchema qw(get_standard_option register_standard_option);
>> @@ -1167,6 +1168,133 @@ sub check_connection {
>>       return 1;
>>   }
>>   
>> +my $prune_backups_mark = sub {
>> +    my ($prune_entries, $keep_count, $id_func) = @_;
>> +
>> +    return if !defined($keep_count);
> 
> defined is not needed? currently it's >= 1 anyway, and even if we switch
> to >= 0 we'd return on 0 here
> 

Ok

>> +
>> +    my $already_included = {};
>> +
>> +    # determine time covered by previous selections
>> +    foreach my $prune_entry (@{$prune_entries}) {
>> +	my $mark = $prune_entry->{mark};
>> +	if (defined($mark) && $mark eq 'keep') {
>> +	    my $id = $id_func->($prune_entry->{ctime});
>> +	    $already_included->{$id} = 1;
>> +	}
>> +    }
> 
> since $prune_entries needs to be sorted already for all of this to work,
> couldn't we combine the two loops? like so:
> 
> 
> diff --git a/PVE/Storage/Plugin.pm b/PVE/Storage/Plugin.pm
> index d47e837..8d5d24d 100644
> --- a/PVE/Storage/Plugin.pm
> +++ b/PVE/Storage/Plugin.pm
> @@ -1173,22 +1173,18 @@ my $prune_backups_mark = sub {
>       return if !defined($keep_count);
>   
>       my $already_included = {};
> +    my $newly_included = {};
>   
>       # determine time covered by previous selections
>       foreach my $prune_entry (@{$prune_entries}) {
>   	my $mark = $prune_entry->{mark};
> +	my $id = $id_func->($prune_entry->{ctime});
> +
>   	if (defined($mark) && $mark eq 'keep') {
> -	    my $id = $id_func->($prune_entry->{ctime});
>   	    $already_included->{$id} = 1;
>   	}
> -    }
>   
> -    my $newly_included = {};
> -
> -    foreach my $prune_entry (@{$prune_entries}) {
> -	next if defined($prune_entry->{mark});
> -
> -	my $id = $id_func->($prune_entry->{ctime});
> +	next if defined($mark);
>   	next if $already_included->{$id};
>   
>   	if (!$newly_included->{$id}) {
> 
> 

Yes, I think that should work.

>> +
>> +    my $newly_included = {};
>> +
>> +    foreach my $prune_entry (@{$prune_entries}) {
>> +	next if defined($prune_entry->{mark});
>> +
>> +	my $id = $id_func->($prune_entry->{ctime});
>> +	next if $already_included->{$id};
>> +
>> +	if (!$newly_included->{$id}) {
>> +	    last if scalar(keys %{$newly_included}) >= $keep_count;
> 
> shouldn't this one be at the start of the loop body?
> 

If one does that, backups that are in the same time slice as the last 
included backup for that time interval are not marked with 'remove' 
right away. This can be problematic, because a {month,year} is not 
necessarily a superset of a week.

So if there is a backup on a Monday, 31st of December and one on 
Tuesday, 1st of January and the first one is included by 'keep-weekly', 
then the second one should be marked with 'remove' and not considered 
for 'keep-{monthly,yearly}'. I'll add a test case for this edge case.

>> +	    $newly_included->{$id} = 1;
>> +	    $prune_entry->{mark} = 'keep';
>> +	} else {
>> +	    $prune_entry->{mark} = 'remove';
>> +	}
>> +    }
>> +};
>> +
>> +sub prune_backups {
>> +    my ($class, $scfg, $storeid, $prune_opts, $vmid, $dryrun) = @_;
>> +
>> +    my $backups = $class->list_volumes($storeid, $scfg, $vmid, ['backup']);
>> +
>> +    my $backup_groups = {};
>> +    my @prune_list;
>> +    foreach my $backup (@{$backups}) {
>> +	my $volid = $backup->{volid};
>> +	my $backup_vmid = $backup->{vmid};
>> +
>> +	my $prune_entry = {
>> +	    volid => $volid,
>> +	    vmid => $backup_vmid,
>> +	    ctime => $backup->{ctime},
>> +	};
>> +
>> +	my $archive_info = eval { PVE::Storage::archive_info($volid) };
>> +
>> +	my $type = defined($archive_info) ? $archive_info->{type} : 'unknown';
>> +	$prune_entry->{type} = $type;
>> +
>> +	if (defined($archive_info) && $archive_info->{is_std_name}) {
>> +	    $prune_entry->{ctime} = $archive_info->{ctime};
>> +	    my $group = "$type/$backup_vmid";
>> +	    push @{$backup_groups->{$group}}, $prune_entry;
>> +	} else {
>> +	    # ignore backups that don't use the standard naming scheme
>> +	    $prune_entry->{mark} = 'protected';
>> +	}
>> +
>> +	push @prune_list, $prune_entry;
>> +    }
>> +
>> +    foreach my $backup_group (values %{$backup_groups}) {
>> +	my @prune_list_for_group = sort { $b->{ctime} <=> $a->{ctime} }
>> +				   @{$backup_group};
>> +
>> +	$prune_backups_mark->(\@prune_list_for_group, $prune_opts->{'keep-last'}, sub {
>> +	    my ($ctime) = @_;
>> +	    return $ctime;
>> +	});
>> +	$prune_backups_mark->(\@prune_list_for_group, $prune_opts->{'keep-hourly'}, sub {
>> +	    my ($ctime) = @_;
>> +	    my (undef, undef, $hour, $day, $month, $year) = localtime($ctime);
>> +	    return "$hour/$day/$month/$year";
>> +	});
>> +	$prune_backups_mark->(\@prune_list_for_group, $prune_opts->{'keep-daily'}, sub {
>> +	    my ($ctime) = @_;
>> +	    my (undef, undef, undef, $day, $month, $year) = localtime($ctime);
>> +	    return "$day/$month/$year";
>> +	});
>> +	$prune_backups_mark->(\@prune_list_for_group, $prune_opts->{'keep-weekly'}, sub {
>> +	    my ($ctime) = @_;
>> +	    my ($sec, $min, $hour, $day, $month, $year) = localtime($ctime);
>> +	    my $iso_week = int(strftime("%V", $sec, $min, $hour, $day, $month - 1, $year - 1900));
>> +	    my $iso_week_year = int(strftime("%G", $sec, $min, $hour, $day, $month - 1, $year - 1900));
>> +	    return "$iso_week/$iso_week_year";
>> +	});
>> +	$prune_backups_mark->(\@prune_list_for_group, $prune_opts->{'keep-monthly'}, sub {
>> +	    my ($ctime) = @_;
>> +	    my (undef, undef, undef, undef, $month, $year) = localtime($ctime);
>> +	    return "$month/$year";
>> +	});
>> +	$prune_backups_mark->(\@prune_list_for_group, $prune_opts->{'keep-yearly'}, sub {
>> +	    my ($ctime) = @_;
>> +	    my $year = (localtime($ctime))[5];
>> +	    return "$year";
> 
> this is broken. e.g., see this map of volids to $ctime to $year:
> 
> 'local:backup/vzdump-lxc-12345-1900_01_01-01_01_01.tar' => {
>                                                               'ctime' => 946684861,
>                                                               'year' => '100'
>                                                             },
> 'local:backup/vzdump-lxc-12345-1950_01_01-01_01_01.tar' => {
>                                                               'ctime' => 2524608061,
>                                                               'year' => '150'
>                                                             },
> 'local:backup/vzdump-lxc-12345-2000_01_01-01_01_01.tar' => {
>                                                               'ctime' => 946684861,
>                                                               'year' => '100'
>                                                             },
> 'local:backup/vzdump-lxc-12345-2050_01_01-01_01_01.tar' => {
>                                                               'ctime' => 2524608061,
>                                                               'year' => '150'
>                                                             },
> 
> a) we also don't want to implement buggy behaviour that bites us in 30
> years
> b) people might think they are clever and add a backup with a timestamp
> long in the future thinking that will make it safe, and it actually does
> the opposite
> 
> (the bug is in archive_info, and probably needs fixing there. also a
> great example why adding extreme cases to test cases is important, that
> would have hopefully caught this just by virtue of getting sorted wrong)
> 

Ok. I'll add a test for archive_info and fix this.

>> +	});
>> +    }
>> +
>> +    foreach my $prune_entry (@prune_list) {
>> +	$prune_entry->{mark} = 'remove' if !defined($prune_entry->{mark});
>> +    }
>> +
>> +    if (!$dryrun) {
>> +	foreach my $prune_entry (@prune_list) {
>> +	    next if $prune_entry->{mark} ne 'remove';
>> +
>> +	    my $volid = $prune_entry->{volid};
>> +	    eval {
>> +		my (undef, $volname) = parse_volume_id($volid);
>> +		$class->free_image($storeid, $scfg, $volname);
> 
> why not PVE::Storage::archive_remove() to remove the log file as well?
> 

Note that this is an old version of the patch series. archive_remove 
didn't exist yet at this point ;)
But don't worry, all of the other comments/suggestions still apply to 
the latest version.

>> +	    };
>> +	    if (my $err = $@) {
>> +		warn "error when removing backup '$volid' - $err\n";
>> +	    }
>> +	}
>> +    }
>> +
>> +    return @prune_list;
>> +}
>> +
>>   # Import/Export interface:
>>   #   Any path based storage is assumed to support 'raw' and 'tar' streams, so
>>   #   the default implementations will return this if $scfg->{path} is set,
>> diff --git a/test/prune_backups_test.pm b/test/prune_backups_test.pm
>> new file mode 100644
>> index 0000000..1ce1b30
>> --- /dev/null
>> +++ b/test/prune_backups_test.pm
>> @@ -0,0 +1,237 @@
>> +package PVE::Storage::TestPruneBackups;
>> +
>> +use strict;
>> +use warnings;
>> +
>> +use lib qw(..);
>> +
>> +use PVE::Storage;
>> +use Test::More;
>> +use Test::MockModule;
>> +
>> +my $storeid = 'BackTest123';
>> +my @vmids = (1234, 9001);
>> +
>> +# only includes the information needed for prune_backups
>> +my $mocked_backups_list = [];
>> +
>> +foreach my $vmid (@vmids) {
>> +    push @{$mocked_backups_list},
>> +	(
>> +	    {
>> +		'volid' => "$storeid:backup/vzdump-qemu-$vmid-2018_05_26-11_18_21.tar.zst",
>> +		'ctime' => 1527333501,
>> +		'vmid'  => $vmid,
>> +	    },
>> +	    {
>> +		'volid' => "$storeid:backup/vzdump-qemu-$vmid-2019_12_31-11_18_21.tar.zst",
>> +		'ctime' => 1577791101,
>> +		'vmid'  => $vmid,
>> +	    },
>> +	    {
>> +		'volid' => "$storeid:backup/vzdump-qemu-$vmid-2019_12_31-11_19_21.tar.zst",
>> +		'ctime' => 1577787561,
>> +		'vmid'  => $vmid,
>> +	    },
>> +	    {
>> +		'volid' => "$storeid:backup/vzdump-qemu-$vmid-2020_01_01-11_18_21.tar.zst",
>> +		'ctime' => 1577877501,
>> +		'vmid'  => $vmid,
>> +	    },
>> +	    {
>> +		'volid' => "$storeid:backup/vzdump-qemu-$vmid-2020_01_01-12_18_21.tar.zst",
>> +		'ctime' => 1577881101,
>> +		'vmid'  => $vmid,
>> +	    },
>> +	    {
>> +		'volid' => "$storeid:backup/vzdump-lxc-$vmid-2020_01_01-12_18_21.tar.zst",
>> +		'ctime' => 1577881101,
>> +		'vmid'  => $vmid,
>> +	    },
>> +	    {
>> +		'volid' => "$storeid:backup/vzdump-$vmid-renamed.tar.zst",
>> +		'ctime' => 1234,
>> +		'vmid'  => $vmid,
>> +	    },
>> +	);
>> +};
>> +my $mock_plugin = Test::MockModule->new('PVE::Storage::Plugin');
>> +$mock_plugin->redefine(list_volumes => sub {
>> +    my ($class, $storeid, $scfg, $vmid, $content_types) = @_;
>> +
>> +    return $mocked_backups_list if !defined($vmid);
>> +
>> +    my @list = grep { $_->{vmid} eq $vmid } @{$mocked_backups_list};
>> +    return \@list;
>> +});
>> +
>> +sub generate_expected {
>> +    my ($vmids, $marks) = @_;
>> +
>> +    my @expected;
>> +    foreach my $vmid (@{$vmids}) {
>> +	push @expected,
>> +	    (
>> +		{
>> +		    'volid' => "$storeid:backup/vzdump-qemu-$vmid-2018_05_26-11_18_21.tar.zst",
>> +		    'type'  => 'qemu',
>> +		    'ctime' => 1527333501,
>> +		    'mark'  => $marks->[0],
>> +		    'vmid'  => $vmid,
>> +		},
>> +		{
>> +		    'volid' => "$storeid:backup/vzdump-qemu-$vmid-2019_12_31-11_18_21.tar.zst",
>> +		    'type'  => 'qemu',
>> +		    'ctime' => 1577791101,
>> +		    'mark'  => $marks->[1],
>> +		    'vmid'  => $vmid,
>> +		},
>> +		{
>> +		    'volid' => "$storeid:backup/vzdump-qemu-$vmid-2019_12_31-11_19_21.tar.zst",
>> +		    'type'  => 'qemu',
>> +		    'ctime' => 1577791161,
>> +		    'mark'  => $marks->[2],
>> +		    'vmid'  => $vmid,
>> +		},
>> +		{
>> +		    'volid' => "$storeid:backup/vzdump-qemu-$vmid-2020_01_01-11_18_21.tar.zst",
>> +		    'type'  => 'qemu',
>> +		    'ctime' => 1577877501,
>> +		    'mark'  => $marks->[3],
>> +		    'vmid'  => $vmid,
>> +		},
>> +		{
>> +		    'volid' => "$storeid:backup/vzdump-qemu-$vmid-2020_01_01-12_18_21.tar.zst",
>> +		    'type'  => 'qemu',
>> +		    'ctime' => 1577881101,
>> +		    'mark'  => $marks->[4],
>> +		    'vmid'  => $vmid,
>> +		},
>> +		{
>> +		    'volid' => "$storeid:backup/vzdump-lxc-$vmid-2020_01_01-12_18_21.tar.zst",
>> +		    'type'  => 'lxc',
>> +		    'ctime' => 1577881101,
>> +		    'mark'  => $marks->[5],
>> +		    'vmid'  => $vmid,
>> +		},
>> +		{
>> +		    'volid' => "$storeid:backup/vzdump-$vmid-renamed.tar.zst",
>> +		    'type'  => 'unknown',
>> +		    'ctime' => 1234,
>> +		    'mark'  => 'protected',
>> +		    'vmid'  => $vmid,
>> +		},
>> +	    );
>> +    }
>> +    my @sorted = sort { $a->{volid} cmp $b->{volid} } @expected;
>> +    return \@sorted;
>> +}
>> +
>> +# an array of test cases, each test is comprised of the following keys:
>> +# description   => to identify a single test
>> +# vmid          => VMID or undef for all
>> +# prune_opts    => override options from storage configuration
>> +# expected      => what prune_backups should return
>> +#
>> +# most of them are created further below
>> +my $tests = [
>> +    {
>> +	description => 'last=3, multiple VMs',
>> +	prune_opts => {
>> +	    'keep-last' => 3,
>> +	},
>> +	expected => generate_expected(\@vmids, ['remove', 'remove', 'keep', 'keep', 'keep', 'keep']),
>> +    },
>> +    {
>> +	description => 'weekly=2, one VM',
>> +	vmid => $vmids[0],
>> +	prune_opts => {
>> +	    'keep-weekly' => 2,
>> +	},
>> +	expected => generate_expected([$vmids[0]], ['keep', 'remove', 'remove', 'remove', 'keep', 'keep']),
>> +    },
>> +    {
>> +	description => 'daily=weekly=monthly=1, multiple VMs',
>> +	prune_opts => {
>> +	    'keep-daily' => 1,
>> +	    'keep-weekly' => 1,
>> +	    'keep-monthly' => 1,
>> +	},
>> +	expected => generate_expected(\@vmids, ['keep', 'remove', 'keep', 'remove', 'keep', 'keep']),
>> +    },
>> +    {
>> +	description => 'hourly=4, one VM',
>> +	vmid => $vmids[0],
>> +	prune_opts => {
>> +	    'keep-hourly' => 4,
>> +	},
>> +	expected => generate_expected([$vmids[0]], ['keep', 'remove', 'keep', 'keep', 'keep', 'keep']),
>> +    },
>> +    {
>> +	description => 'yearly=2, multiple VMs',
>> +	prune_opts => {
>> +	    'keep-yearly' => 2,
>> +	},
>> +	expected => generate_expected(\@vmids, ['remove', 'remove', 'keep', 'remove', 'keep', 'keep']),
>> +    },
>> +    {
>> +	description => 'last=2,hourly=2 one VM',
>> +	vmid => $vmids[0],
>> +	prune_opts => {
>> +	    'keep-last' => 2,
>> +	    'keep-hourly' => 2,
>> +	},
>> +	expected => generate_expected([$vmids[0]], ['keep', 'remove', 'keep', 'keep', 'keep', 'keep']),
>> +    },
>> +    {
>> +	description => 'last=1,monthly=2, multiple VMs',
>> +	prune_opts => {
>> +	    'keep-last' => 1,
>> +	    'keep-monthly' => 2,
>> +	},
>> +	expected => generate_expected(\@vmids, ['keep', 'remove', 'keep', 'remove', 'keep', 'keep']),
>> +    },
>> +    {
>> +	description => 'monthly=3, one VM',
>> +	vmid => $vmids[0],
>> +	prune_opts => {
>> +	    'keep-monthly' => 3,
>> +	},
>> +	expected => generate_expected([$vmids[0]], ['keep', 'remove', 'keep', 'remove', 'keep', 'keep']),
>> +    },
>> +    {
>> +	description => 'last=daily=weekly=1, multiple VMs',
>> +	prune_opts => {
>> +	    'keep-last' => 1,
>> +	    'keep-daily' => 1,
>> +	    'keep-weekly' => 1,
>> +	},
>> +	expected => generate_expected(\@vmids, ['keep', 'remove', 'keep', 'remove', 'keep', 'keep']),
>> +    },
>> +    {
>> +	description => 'daily=2, one VM',
>> +	vmid => $vmids[0],
>> +	prune_opts => {
>> +	    'keep-daily' => 2,
>> +	},
>> +	expected => generate_expected([$vmids[0]], ['remove', 'remove', 'keep', 'remove', 'keep', 'keep']),
>> +    },
>> +];
>> +
>> +plan tests => scalar @$tests;
>> +
>> +for my $tt (@$tests) {
>> +
>> +    my $got = eval {
>> +	my @list = PVE::Storage::Plugin->prune_backups($tt->{scfg}, $storeid, $tt->{prune_opts}, $tt->{vmid}, 1);
>> +	my @sorted = sort { $a->{volid} cmp $b->{volid} } @list;
>> +	return \@sorted;
>> +    };
>> +    $got = $@ if $@;
>> +
>> +    is_deeply($got, $tt->{expected}, $tt->{description}) || diag(explain($got));
>> +}
>> +
>> +done_testing();
>> +
>> +1;
>> diff --git a/test/run_plugin_tests.pl b/test/run_plugin_tests.pl
>> index 54322bb..d33429a 100755
>> --- a/test/run_plugin_tests.pl
>> +++ b/test/run_plugin_tests.pl
>> @@ -16,6 +16,7 @@ my $res = $harness->runtests(
>>       "path_to_volume_id_test.pm",
>>       "get_subdir_test.pm",
>>       "filesystem_path_test.pm",
>> +    "prune_backups_test.pm",
>>   );
>>   
>>   exit -1 if !$res || $res->{failed} || $res->{parse_errors};
>> -- 
>> 2.20.1
>>
>>
>> _______________________________________________
>> pve-devel mailing list
>> pve-devel@pve.proxmox.com
>> https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
>>
>>
> 
> _______________________________________________
> pve-devel mailing list
> pve-devel@pve.proxmox.com
> https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
> 




^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [pve-devel] [PATCH storage 4/6] Add prune_backups to storage API
       [not found]   ` <1594035985.yxpjph40wm.astroid@nora.none>
  2020-07-08  7:20     ` [pve-devel] [PATCH storage 4/6] Add prune_backups to storage API Fabian Ebner
@ 2020-07-08  7:26     ` Fabian Ebner
  1 sibling, 0 replies; 4+ messages in thread
From: Fabian Ebner @ 2020-07-08  7:26 UTC (permalink / raw)
  To: pve-devel; +Cc: Fabian Grünbichler, Thomas Lamprecht

Am 07.07.20 um 08:38 schrieb Fabian Grünbichler:
> On June 4, 2020 11:08 am, Fabian Ebner wrote:
>> Implement it for generic storages supporting backups (i.e.
>> directory-based storages) and add a wrapper for PBS.
>>
>> Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
>> ---
>>   PVE/Storage.pm             |  27 ++++-
>>   PVE/Storage/PBSPlugin.pm   |  50 ++++++++
>>   PVE/Storage/Plugin.pm      | 128 ++++++++++++++++++++
>>   test/prune_backups_test.pm | 237 +++++++++++++++++++++++++++++++++++++
>>   test/run_plugin_tests.pl   |   1 +
>>   5 files changed, 441 insertions(+), 2 deletions(-)
>>   create mode 100644 test/prune_backups_test.pm
>>
>> diff --git a/PVE/Storage.pm b/PVE/Storage.pm
>> index 07a4f53..e1d0d02 100755
>> --- a/PVE/Storage.pm
>> +++ b/PVE/Storage.pm
>> @@ -40,11 +40,11 @@ use PVE::Storage::DRBDPlugin;
>>   use PVE::Storage::PBSPlugin;
>>   
>>   # Storage API version. Icrement it on changes in storage API interface.
>> -use constant APIVER => 5;
>> +use constant APIVER => 6;
>>   # Age is the number of versions we're backward compatible with.
>>   # This is like having 'current=APIVER' and age='APIAGE' in libtool,
>>   # see https://www.gnu.org/software/libtool/manual/html_node/Libtool-versioning.html
>> -use constant APIAGE => 4;
>> +use constant APIAGE => 5;
>>   
>>   # load standard plugins
>>   PVE::Storage::DirPlugin->register();
>> @@ -1522,6 +1522,29 @@ sub extract_vzdump_config {
>>       }
>>   }
>>   
>> +sub prune_backups {
>> +    my ($cfg, $storeid, $opts_override, $vmid, $dryrun) = @_;
> 
> s/opts_override/opts/
> s/prune_opts/opts/
> 
> or maybe $keep? since we only have either, and they both represent the
> same options..
> 

Ok

>> +
>> +    my $scfg = storage_config($cfg, $storeid);
>> +    die "storage '$storeid' does not support backups\n" if !$scfg->{content}->{backup};
>> +
>> +    my $prune_opts = $opts_override;
> 
> not needed then
> 
>> +    if (!defined($prune_opts)) {
>> +	die "no prune-backups options configured for storage '$storeid'\n"
>> +	    if !defined($scfg->{'prune-backups'});
>> +	$prune_opts = PVE::JSONSchema::parse_property_string('prune-backups', $scfg->{'prune-backups'});
>> +    }
>> +
>> +    my $all_zero = 1;
>> +    foreach my $opt (keys %{$prune_opts}) {
>> +	$all_zero = 0 if defined($prune_opts->{$opt}) && $prune_opts->{$opt} > 0;
>> +    }
>> +    die "at least one keep-option must be set and positive\n" if $all_zero;
> 
> if we switch to integer:
> 
> die ...
>      if !grep { defined($opts->{$_}) && $opts->{$_} } keys %$opts;
> 
> or we could use the new format validator functionality to ensure this
> directly in parse_property_string ;)
> 

Yes, I felt like here would be a good use case for the validation when 
writing this. And using 'values' instead of 'keys' it becomes even 
simpler. I'll switch to integer and allow 0 as well.

>> +
>> +    my $plugin = PVE::Storage::Plugin->lookup($scfg->{type});
>> +    return $plugin->prune_backups($scfg, $storeid, $prune_opts, $vmid, $dryrun);
>> +}
>> +
>>   sub volume_export {
>>       my ($cfg, $fh, $volid, $format, $snapshot, $base_snapshot, $with_snapshots) = @_;
>>   
>> diff --git a/PVE/Storage/PBSPlugin.pm b/PVE/Storage/PBSPlugin.pm
>> index fba4b2b..f37bd66 100644
>> --- a/PVE/Storage/PBSPlugin.pm
>> +++ b/PVE/Storage/PBSPlugin.pm
>> @@ -173,6 +173,56 @@ sub extract_vzdump_config {
>>       return $config;
>>   }
>>   
>> +sub prune_backups {
>> +    my ($class, $scfg, $storeid, $prune_opts, $vmid, $dryrun) = @_;
>> +
>> +    my $backups = $class->list_volumes($storeid, $scfg, $vmid, ['backup']);
>> +
>> +    my $backup_groups = {};
>> +    foreach my $backup (@{$backups}) {
>> +	(my $guest_type = $backup->{format}) =~ s/^pbs-//;
>> +	my $backup_group = "$guest_type/$backup->{vmid}";
>> +	$backup_groups->{$backup_group} = 1;
>> +    }
>> +
>> +    my @param;
>> +    foreach my $prune_option (keys %{$prune_opts}) {
>> +	push @param, "--$prune_option";
>> +	push @param, "$prune_opts->{$prune_option}";
>> +    }
>> +
>> +    push @param, '--dry-run' if $dryrun;
>> +    push @param, '--output-format=json';
>> +
>> +    my @prune_list;
> 
> why not a reference? would make extending this easier in the future wrt
> return values
> 

Ok

>> +
>> +    foreach my $backup_group (keys %{$backup_groups}) {
>> +	my $json_str;
>> +	my $outfunc = sub { $json_str .= "$_[0]\n" };
> 
> a simple
> 
> sub { $json_str .= shift }
> 
> should be enough, since we don't actually care about the new lines?
> 
>> +
>> +	run_raw_client_cmd($scfg, $storeid, 'prune', [ $backup_group, @param ],
>> +			   outfunc => $outfunc, errmsg => 'proxmox-backup-client failed');
> 
> error handling here
> 
>> +
>> +	my $res = decode_json($json_str);
> 
> and here would be nice (at least to get better error messages, possibly
> to warn and continue with pruning other backup groups and die at the
> end?)
> 
>> +	foreach my $backup (@{$res}) {
>> +	    my $type = $backup->{'backup-type'};
>> +	    my $vmid = $backup->{'backup-id'};
>> +	    my $ctime = $backup->{'backup-time'};
>> +	    my $time_str = localtime($ctime);
>> +	    my $prune_entry = {
>> +		volid => "$type/$vmid-$time_str",
>> +		type => $type eq 'vm' ? 'qemu' : 'lxc',
>> +		ctime => $ctime,
>> +		vmid => $vmid,
>> +		mark => $backup->{keep} ? 'keep' : 'remove',
> 
> depending on how stable this interface is, it might make sense to ensure
> all the expected keys are actually defined..
> 

I'll add an eval around each backup_group iteration and simply warn on 
failure or if some keys are not defined. I'll also switch over to using 
run_client_command instead of run_raw_client_command.

>> +	    };
>> +	    push @prune_list, $prune_entry;
>> +	}
>> +    }
>> +
>> +    return @prune_list;
>> +}
>> +
>>   sub on_add_hook {
>>       my ($class, $storeid, $scfg, %param) = @_;
>>   
>> diff --git a/PVE/Storage/Plugin.pm b/PVE/Storage/Plugin.pm
>> index 1ba1b99..a11bba5 100644
>> --- a/PVE/Storage/Plugin.pm
>> +++ b/PVE/Storage/Plugin.pm
>> @@ -8,6 +8,7 @@ use File::chdir;
>>   use File::Path;
>>   use File::Basename;
>>   use File::stat qw();
>> +use POSIX qw(strftime);
>>   
>>   use PVE::Tools qw(run_command);
>>   use PVE::JSONSchema qw(get_standard_option register_standard_option);
>> @@ -1167,6 +1168,133 @@ sub check_connection {
>>       return 1;
>>   }
>>   
>> +my $prune_backups_mark = sub {
>> +    my ($prune_entries, $keep_count, $id_func) = @_;
>> +
>> +    return if !defined($keep_count);
> 
> defined is not needed? currently it's >= 1 anyway, and even if we switch
> to >= 0 we'd return on 0 here
> 

Ok

>> +
>> +    my $already_included = {};
>> +
>> +    # determine time covered by previous selections
>> +    foreach my $prune_entry (@{$prune_entries}) {
>> +	my $mark = $prune_entry->{mark};
>> +	if (defined($mark) && $mark eq 'keep') {
>> +	    my $id = $id_func->($prune_entry->{ctime});
>> +	    $already_included->{$id} = 1;
>> +	}
>> +    }
> 
> since $prune_entries needs to be sorted already for all of this to work,
> couldn't we combine the two loops? like so:
> 
> 
> diff --git a/PVE/Storage/Plugin.pm b/PVE/Storage/Plugin.pm
> index d47e837..8d5d24d 100644
> --- a/PVE/Storage/Plugin.pm
> +++ b/PVE/Storage/Plugin.pm
> @@ -1173,22 +1173,18 @@ my $prune_backups_mark = sub {
>       return if !defined($keep_count);
>   
>       my $already_included = {};
> +    my $newly_included = {};
>   
>       # determine time covered by previous selections
>       foreach my $prune_entry (@{$prune_entries}) {
>   	my $mark = $prune_entry->{mark};
> +	my $id = $id_func->($prune_entry->{ctime});
> +
>   	if (defined($mark) && $mark eq 'keep') {
> -	    my $id = $id_func->($prune_entry->{ctime});
>   	    $already_included->{$id} = 1;
>   	}
> -    }
>   
> -    my $newly_included = {};
> -
> -    foreach my $prune_entry (@{$prune_entries}) {
> -	next if defined($prune_entry->{mark});
> -
> -	my $id = $id_func->($prune_entry->{ctime});
> +	next if defined($mark);
>   	next if $already_included->{$id};
>   
>   	if (!$newly_included->{$id}) {
> 
> 

Yes, I think that should work.

>> +
>> +    my $newly_included = {};
>> +
>> +    foreach my $prune_entry (@{$prune_entries}) {
>> +	next if defined($prune_entry->{mark});
>> +
>> +	my $id = $id_func->($prune_entry->{ctime});
>> +	next if $already_included->{$id};
>> +
>> +	if (!$newly_included->{$id}) {
>> +	    last if scalar(keys %{$newly_included}) >= $keep_count;
> 
> shouldn't this one be at the start of the loop body?
> 

If one does that, backups that are in the same time slice as the last 
included backup for that time interval are not marked with 'remove' 
right away. This can be problematic, because a {month,year} is not 
necessarily a superset of a week.

So if there is a backup on a Monday, 31st of December and one on 
Tuesday, 1st of January and the first one is included by 'keep-weekly', 
then the second one should be marked with 'remove' and not considered 
for 'keep-{monthly,yearly}'. I'll add a test case for this edge case.

>> +	    $newly_included->{$id} = 1;
>> +	    $prune_entry->{mark} = 'keep';
>> +	} else {
>> +	    $prune_entry->{mark} = 'remove';
>> +	}
>> +    }
>> +};
>> +
>> +sub prune_backups {
>> +    my ($class, $scfg, $storeid, $prune_opts, $vmid, $dryrun) = @_;
>> +
>> +    my $backups = $class->list_volumes($storeid, $scfg, $vmid, ['backup']);
>> +
>> +    my $backup_groups = {};
>> +    my @prune_list;
>> +    foreach my $backup (@{$backups}) {
>> +	my $volid = $backup->{volid};
>> +	my $backup_vmid = $backup->{vmid};
>> +
>> +	my $prune_entry = {
>> +	    volid => $volid,
>> +	    vmid => $backup_vmid,
>> +	    ctime => $backup->{ctime},
>> +	};
>> +
>> +	my $archive_info = eval { PVE::Storage::archive_info($volid) };
>> +
>> +	my $type = defined($archive_info) ? $archive_info->{type} : 'unknown';
>> +	$prune_entry->{type} = $type;
>> +
>> +	if (defined($archive_info) && $archive_info->{is_std_name}) {
>> +	    $prune_entry->{ctime} = $archive_info->{ctime};
>> +	    my $group = "$type/$backup_vmid";
>> +	    push @{$backup_groups->{$group}}, $prune_entry;
>> +	} else {
>> +	    # ignore backups that don't use the standard naming scheme
>> +	    $prune_entry->{mark} = 'protected';
>> +	}
>> +
>> +	push @prune_list, $prune_entry;
>> +    }
>> +
>> +    foreach my $backup_group (values %{$backup_groups}) {
>> +	my @prune_list_for_group = sort { $b->{ctime} <=> $a->{ctime} }
>> +				   @{$backup_group};
>> +
>> +	$prune_backups_mark->(\@prune_list_for_group, $prune_opts->{'keep-last'}, sub {
>> +	    my ($ctime) = @_;
>> +	    return $ctime;
>> +	});
>> +	$prune_backups_mark->(\@prune_list_for_group, $prune_opts->{'keep-hourly'}, sub {
>> +	    my ($ctime) = @_;
>> +	    my (undef, undef, $hour, $day, $month, $year) = localtime($ctime);
>> +	    return "$hour/$day/$month/$year";
>> +	});
>> +	$prune_backups_mark->(\@prune_list_for_group, $prune_opts->{'keep-daily'}, sub {
>> +	    my ($ctime) = @_;
>> +	    my (undef, undef, undef, $day, $month, $year) = localtime($ctime);
>> +	    return "$day/$month/$year";
>> +	});
>> +	$prune_backups_mark->(\@prune_list_for_group, $prune_opts->{'keep-weekly'}, sub {
>> +	    my ($ctime) = @_;
>> +	    my ($sec, $min, $hour, $day, $month, $year) = localtime($ctime);
>> +	    my $iso_week = int(strftime("%V", $sec, $min, $hour, $day, $month - 1, $year - 1900));
>> +	    my $iso_week_year = int(strftime("%G", $sec, $min, $hour, $day, $month - 1, $year - 1900));
>> +	    return "$iso_week/$iso_week_year";
>> +	});
>> +	$prune_backups_mark->(\@prune_list_for_group, $prune_opts->{'keep-monthly'}, sub {
>> +	    my ($ctime) = @_;
>> +	    my (undef, undef, undef, undef, $month, $year) = localtime($ctime);
>> +	    return "$month/$year";
>> +	});
>> +	$prune_backups_mark->(\@prune_list_for_group, $prune_opts->{'keep-yearly'}, sub {
>> +	    my ($ctime) = @_;
>> +	    my $year = (localtime($ctime))[5];
>> +	    return "$year";
> 
> this is broken. e.g., see this map of volids to $ctime to $year:
> 
> 'local:backup/vzdump-lxc-12345-1900_01_01-01_01_01.tar' => {
>                                                               'ctime' => 946684861,
>                                                               'year' => '100'
>                                                             },
> 'local:backup/vzdump-lxc-12345-1950_01_01-01_01_01.tar' => {
>                                                               'ctime' => 2524608061,
>                                                               'year' => '150'
>                                                             },
> 'local:backup/vzdump-lxc-12345-2000_01_01-01_01_01.tar' => {
>                                                               'ctime' => 946684861,
>                                                               'year' => '100'
>                                                             },
> 'local:backup/vzdump-lxc-12345-2050_01_01-01_01_01.tar' => {
>                                                               'ctime' => 2524608061,
>                                                               'year' => '150'
>                                                             },
> 
> a) we also don't want to implement buggy behaviour that bites us in 30
> years
> b) people might think they are clever and add a backup with a timestamp
> long in the future thinking that will make it safe, and it actually does
> the opposite
> 
> (the bug is in archive_info, and probably needs fixing there. also a
> great example why adding extreme cases to test cases is important, that
> would have hopefully caught this just by virtue of getting sorted wrong)
> 

Ok. I'll add a test for archive_info and fix this.

>> +	});
>> +    }
>> +
>> +    foreach my $prune_entry (@prune_list) {
>> +	$prune_entry->{mark} = 'remove' if !defined($prune_entry->{mark});
>> +    }
>> +
>> +    if (!$dryrun) {
>> +	foreach my $prune_entry (@prune_list) {
>> +	    next if $prune_entry->{mark} ne 'remove';
>> +
>> +	    my $volid = $prune_entry->{volid};
>> +	    eval {
>> +		my (undef, $volname) = parse_volume_id($volid);
>> +		$class->free_image($storeid, $scfg, $volname);
> 
> why not PVE::Storage::archive_remove() to remove the log file as well?
> 

Note that this is an old version of the patch series. archive_remove 
didn't exist yet at this point ;)
But don't worry, all of the other comments/suggestions still apply to 
the latest version.

>> +	    };
>> +	    if (my $err = $@) {
>> +		warn "error when removing backup '$volid' - $err\n";
>> +	    }
>> +	}
>> +    }
>> +
>> +    return @prune_list;
>> +}
>> +
>>   # Import/Export interface:
>>   #   Any path based storage is assumed to support 'raw' and 'tar' streams, so
>>   #   the default implementations will return this if $scfg->{path} is set,
>> diff --git a/test/prune_backups_test.pm b/test/prune_backups_test.pm
>> new file mode 100644
>> index 0000000..1ce1b30
>> --- /dev/null
>> +++ b/test/prune_backups_test.pm
>> @@ -0,0 +1,237 @@
>> +package PVE::Storage::TestPruneBackups;
>> +
>> +use strict;
>> +use warnings;
>> +
>> +use lib qw(..);
>> +
>> +use PVE::Storage;
>> +use Test::More;
>> +use Test::MockModule;
>> +
>> +my $storeid = 'BackTest123';
>> +my @vmids = (1234, 9001);
>> +
>> +# only includes the information needed for prune_backups
>> +my $mocked_backups_list = [];
>> +
>> +foreach my $vmid (@vmids) {
>> +    push @{$mocked_backups_list},
>> +	(
>> +	    {
>> +		'volid' => "$storeid:backup/vzdump-qemu-$vmid-2018_05_26-11_18_21.tar.zst",
>> +		'ctime' => 1527333501,
>> +		'vmid'  => $vmid,
>> +	    },
>> +	    {
>> +		'volid' => "$storeid:backup/vzdump-qemu-$vmid-2019_12_31-11_18_21.tar.zst",
>> +		'ctime' => 1577791101,
>> +		'vmid'  => $vmid,
>> +	    },
>> +	    {
>> +		'volid' => "$storeid:backup/vzdump-qemu-$vmid-2019_12_31-11_19_21.tar.zst",
>> +		'ctime' => 1577787561,
>> +		'vmid'  => $vmid,
>> +	    },
>> +	    {
>> +		'volid' => "$storeid:backup/vzdump-qemu-$vmid-2020_01_01-11_18_21.tar.zst",
>> +		'ctime' => 1577877501,
>> +		'vmid'  => $vmid,
>> +	    },
>> +	    {
>> +		'volid' => "$storeid:backup/vzdump-qemu-$vmid-2020_01_01-12_18_21.tar.zst",
>> +		'ctime' => 1577881101,
>> +		'vmid'  => $vmid,
>> +	    },
>> +	    {
>> +		'volid' => "$storeid:backup/vzdump-lxc-$vmid-2020_01_01-12_18_21.tar.zst",
>> +		'ctime' => 1577881101,
>> +		'vmid'  => $vmid,
>> +	    },
>> +	    {
>> +		'volid' => "$storeid:backup/vzdump-$vmid-renamed.tar.zst",
>> +		'ctime' => 1234,
>> +		'vmid'  => $vmid,
>> +	    },
>> +	);
>> +};
>> +my $mock_plugin = Test::MockModule->new('PVE::Storage::Plugin');
>> +$mock_plugin->redefine(list_volumes => sub {
>> +    my ($class, $storeid, $scfg, $vmid, $content_types) = @_;
>> +
>> +    return $mocked_backups_list if !defined($vmid);
>> +
>> +    my @list = grep { $_->{vmid} eq $vmid } @{$mocked_backups_list};
>> +    return \@list;
>> +});
>> +
>> +sub generate_expected {
>> +    my ($vmids, $marks) = @_;
>> +
>> +    my @expected;
>> +    foreach my $vmid (@{$vmids}) {
>> +	push @expected,
>> +	    (
>> +		{
>> +		    'volid' => "$storeid:backup/vzdump-qemu-$vmid-2018_05_26-11_18_21.tar.zst",
>> +		    'type'  => 'qemu',
>> +		    'ctime' => 1527333501,
>> +		    'mark'  => $marks->[0],
>> +		    'vmid'  => $vmid,
>> +		},
>> +		{
>> +		    'volid' => "$storeid:backup/vzdump-qemu-$vmid-2019_12_31-11_18_21.tar.zst",
>> +		    'type'  => 'qemu',
>> +		    'ctime' => 1577791101,
>> +		    'mark'  => $marks->[1],
>> +		    'vmid'  => $vmid,
>> +		},
>> +		{
>> +		    'volid' => "$storeid:backup/vzdump-qemu-$vmid-2019_12_31-11_19_21.tar.zst",
>> +		    'type'  => 'qemu',
>> +		    'ctime' => 1577791161,
>> +		    'mark'  => $marks->[2],
>> +		    'vmid'  => $vmid,
>> +		},
>> +		{
>> +		    'volid' => "$storeid:backup/vzdump-qemu-$vmid-2020_01_01-11_18_21.tar.zst",
>> +		    'type'  => 'qemu',
>> +		    'ctime' => 1577877501,
>> +		    'mark'  => $marks->[3],
>> +		    'vmid'  => $vmid,
>> +		},
>> +		{
>> +		    'volid' => "$storeid:backup/vzdump-qemu-$vmid-2020_01_01-12_18_21.tar.zst",
>> +		    'type'  => 'qemu',
>> +		    'ctime' => 1577881101,
>> +		    'mark'  => $marks->[4],
>> +		    'vmid'  => $vmid,
>> +		},
>> +		{
>> +		    'volid' => "$storeid:backup/vzdump-lxc-$vmid-2020_01_01-12_18_21.tar.zst",
>> +		    'type'  => 'lxc',
>> +		    'ctime' => 1577881101,
>> +		    'mark'  => $marks->[5],
>> +		    'vmid'  => $vmid,
>> +		},
>> +		{
>> +		    'volid' => "$storeid:backup/vzdump-$vmid-renamed.tar.zst",
>> +		    'type'  => 'unknown',
>> +		    'ctime' => 1234,
>> +		    'mark'  => 'protected',
>> +		    'vmid'  => $vmid,
>> +		},
>> +	    );
>> +    }
>> +    my @sorted = sort { $a->{volid} cmp $b->{volid} } @expected;
>> +    return \@sorted;
>> +}
>> +
>> +# an array of test cases, each test is comprised of the following keys:
>> +# description   => to identify a single test
>> +# vmid          => VMID or undef for all
>> +# prune_opts    => override options from storage configuration
>> +# expected      => what prune_backups should return
>> +#
>> +# most of them are created further below
>> +my $tests = [
>> +    {
>> +	description => 'last=3, multiple VMs',
>> +	prune_opts => {
>> +	    'keep-last' => 3,
>> +	},
>> +	expected => generate_expected(\@vmids, ['remove', 'remove', 'keep', 'keep', 'keep', 'keep']),
>> +    },
>> +    {
>> +	description => 'weekly=2, one VM',
>> +	vmid => $vmids[0],
>> +	prune_opts => {
>> +	    'keep-weekly' => 2,
>> +	},
>> +	expected => generate_expected([$vmids[0]], ['keep', 'remove', 'remove', 'remove', 'keep', 'keep']),
>> +    },
>> +    {
>> +	description => 'daily=weekly=monthly=1, multiple VMs',
>> +	prune_opts => {
>> +	    'keep-daily' => 1,
>> +	    'keep-weekly' => 1,
>> +	    'keep-monthly' => 1,
>> +	},
>> +	expected => generate_expected(\@vmids, ['keep', 'remove', 'keep', 'remove', 'keep', 'keep']),
>> +    },
>> +    {
>> +	description => 'hourly=4, one VM',
>> +	vmid => $vmids[0],
>> +	prune_opts => {
>> +	    'keep-hourly' => 4,
>> +	},
>> +	expected => generate_expected([$vmids[0]], ['keep', 'remove', 'keep', 'keep', 'keep', 'keep']),
>> +    },
>> +    {
>> +	description => 'yearly=2, multiple VMs',
>> +	prune_opts => {
>> +	    'keep-yearly' => 2,
>> +	},
>> +	expected => generate_expected(\@vmids, ['remove', 'remove', 'keep', 'remove', 'keep', 'keep']),
>> +    },
>> +    {
>> +	description => 'last=2,hourly=2 one VM',
>> +	vmid => $vmids[0],
>> +	prune_opts => {
>> +	    'keep-last' => 2,
>> +	    'keep-hourly' => 2,
>> +	},
>> +	expected => generate_expected([$vmids[0]], ['keep', 'remove', 'keep', 'keep', 'keep', 'keep']),
>> +    },
>> +    {
>> +	description => 'last=1,monthly=2, multiple VMs',
>> +	prune_opts => {
>> +	    'keep-last' => 1,
>> +	    'keep-monthly' => 2,
>> +	},
>> +	expected => generate_expected(\@vmids, ['keep', 'remove', 'keep', 'remove', 'keep', 'keep']),
>> +    },
>> +    {
>> +	description => 'monthly=3, one VM',
>> +	vmid => $vmids[0],
>> +	prune_opts => {
>> +	    'keep-monthly' => 3,
>> +	},
>> +	expected => generate_expected([$vmids[0]], ['keep', 'remove', 'keep', 'remove', 'keep', 'keep']),
>> +    },
>> +    {
>> +	description => 'last=daily=weekly=1, multiple VMs',
>> +	prune_opts => {
>> +	    'keep-last' => 1,
>> +	    'keep-daily' => 1,
>> +	    'keep-weekly' => 1,
>> +	},
>> +	expected => generate_expected(\@vmids, ['keep', 'remove', 'keep', 'remove', 'keep', 'keep']),
>> +    },
>> +    {
>> +	description => 'daily=2, one VM',
>> +	vmid => $vmids[0],
>> +	prune_opts => {
>> +	    'keep-daily' => 2,
>> +	},
>> +	expected => generate_expected([$vmids[0]], ['remove', 'remove', 'keep', 'remove', 'keep', 'keep']),
>> +    },
>> +];
>> +
>> +plan tests => scalar @$tests;
>> +
>> +for my $tt (@$tests) {
>> +
>> +    my $got = eval {
>> +	my @list = PVE::Storage::Plugin->prune_backups($tt->{scfg}, $storeid, $tt->{prune_opts}, $tt->{vmid}, 1);
>> +	my @sorted = sort { $a->{volid} cmp $b->{volid} } @list;
>> +	return \@sorted;
>> +    };
>> +    $got = $@ if $@;
>> +
>> +    is_deeply($got, $tt->{expected}, $tt->{description}) || diag(explain($got));
>> +}
>> +
>> +done_testing();
>> +
>> +1;
>> diff --git a/test/run_plugin_tests.pl b/test/run_plugin_tests.pl
>> index 54322bb..d33429a 100755
>> --- a/test/run_plugin_tests.pl
>> +++ b/test/run_plugin_tests.pl
>> @@ -16,6 +16,7 @@ my $res = $harness->runtests(
>>       "path_to_volume_id_test.pm",
>>       "get_subdir_test.pm",
>>       "filesystem_path_test.pm",
>> +    "prune_backups_test.pm",
>>   );
>>   
>>   exit -1 if !$res || $res->{failed} || $res->{parse_errors};
>> -- 
>> 2.20.1
>>
>>
>> _______________________________________________
>> pve-devel mailing list
>> pve-devel@pve.proxmox.com
>> https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
>>
>>
> 
> _______________________________________________
> pve-devel mailing list
> pve-devel@pve.proxmox.com
> https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
> 




^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [pve-devel] [PATCH storage 6/6] Add API and pvesm calls for prune_backups
       [not found]   ` <1594040467.ikb3n0sela.astroid@nora.none>
@ 2020-07-08  7:48     ` Fabian Ebner
  2020-07-08  8:17       ` Fabian Grünbichler
  0 siblings, 1 reply; 4+ messages in thread
From: Fabian Ebner @ 2020-07-08  7:48 UTC (permalink / raw)
  To: pve-devel; +Cc: Fabian Grünbichler, Thomas Lamprecht

Am 07.07.20 um 08:46 schrieb Fabian Grünbichler:
> s/VM/guest in most descriptions - this is not in qemu-server ;)
> 
> On June 4, 2020 11:08 am, Fabian Ebner wrote:
>> Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
>> ---
>>
>> Not sure if this is the best place for the new API endpoints.
>>
>> I decided to opt for two distinct calls rather than just using a
>> --dry-run option and use a worker for actually pruning, because
>> removing many backups over the network might take a while.
> 
> see comments below
> 
>>
>>   PVE/API2/Storage/Makefile        |   2 +-
>>   PVE/API2/Storage/PruneBackups.pm | 153 +++++++++++++++++++++++++++++++
>>   PVE/API2/Storage/Status.pm       |   7 ++
>>   PVE/CLI/pvesm.pm                 |  27 ++++++
>>   4 files changed, 188 insertions(+), 1 deletion(-)
>>   create mode 100644 PVE/API2/Storage/PruneBackups.pm
>>
>> diff --git a/PVE/API2/Storage/Makefile b/PVE/API2/Storage/Makefile
>> index a33525b..3f667e8 100644
>> --- a/PVE/API2/Storage/Makefile
>> +++ b/PVE/API2/Storage/Makefile
>> @@ -1,5 +1,5 @@
>>   
>> -SOURCES= Content.pm Status.pm Config.pm
>> +SOURCES= Content.pm Status.pm Config.pm PruneBackups.pm
>>   
>>   .PHONY: install
>>   install:
>> diff --git a/PVE/API2/Storage/PruneBackups.pm b/PVE/API2/Storage/PruneBackups.pm
>> new file mode 100644
>> index 0000000..7968730
>> --- /dev/null
>> +++ b/PVE/API2/Storage/PruneBackups.pm
>> @@ -0,0 +1,153 @@
>> +package PVE::API2::Storage::PruneBackups;
>> +
>> +use strict;
>> +use warnings;
>> +
>> +use PVE::Cluster;
>> +use PVE::JSONSchema qw(get_standard_option);
>> +use PVE::RESTHandler;
>> +use PVE::RPCEnvironment;
>> +use PVE::Storage;
>> +use PVE::Tools qw(extract_param);
>> +
>> +use base qw(PVE::RESTHandler);
>> +
>> +__PACKAGE__->register_method ({
>> +    name => 'index',
> 
> IMHO this is not an index, but just a regular 'GET' API path.
> 

Ok

>> +    path => '',
>> +    method => 'GET',
>> +    description => "Get prune information for backups.",
>> +    permissions => {
>> +	check => ['perm', '/storage/{storage}', ['Datastore.Audit', 'Datastore.AllocateSpace'], any => 1],
>> +    },
>> +    protected => 1,
>> +    proxyto => 'node',
>> +    parameters => {
>> +	additionalProperties => 0,
>> +	properties => {
>> +	    node => get_standard_option('pve-node'),
>> +	    storage => get_standard_option('pve-storage-id', {
>> +		completion => \&PVE::Storage::complete_storage_enabled,
>> +            }),
>> +	    'prune-backups' => get_standard_option('prune-backups', {
>> +		description => "Use these retention options instead of those from the storage configuration.",
>> +		optional => 1,
>> +	    }),
>> +	    vmid => get_standard_option('pve-vmid', {
>> +		description => "Only consider backups for this VM.",
> 
> of this guest
> 
> would it make sense to allow specification of guest-type as well, so
> that it's possible to indirectly specify the 'backup-group' ?
> 

We don't really support having backups of two kinds of guest with the 
same ID currently, do we? If this is really needed at some point, we can 
still extend this, but for now it should be enough if the such backups 
get grouped correctly as to not interfere with one another.

>> +		optional => 1,
>> +		completion => \&PVE::Cluster::complete_vmid,
>> +	    }),
>> +	},
>> +    },
>> +    returns => {
>> +	type => 'array',
>> +	items => {
>> +	    type => 'object',
>> +	    properties => {
>> +		volid => {
>> +		    description => "Backup volume ID.",
>> +		    type => 'string',
>> +		},
>> +		type => {
>> +		    description => "One of 'qemu', 'lxc', 'openvz' or 'unknown'.",
>> +		    type => 'string',
>> +		},
>> +		'ctime' => {
>> +		    description => "Creation time of the backup (seconds since the UNIX epoch).",
>> +		    type => 'integer',
>> +		},
>> +		'mark' => {
>> +		    description => "Whether the backup would be kept or removed. For backups that don't " .
>> +				   "use the standard naming scheme, it's 'protected'.",
>> +		    type => 'string',
>> +		},
>> +		'vmid' => {
>> +		    description => "The VM the backup belongs to.",
>> +		    type => 'integer',
>> +		    optional => 1,
>> +		},
>> +	    },
>> +	},
>> +    },
>> +    code => sub {
>> +	my ($param) = @_;
>> +
>> +	my $cfg = PVE::Storage::config();
>> +
>> +	my $vmid = extract_param($param, 'vmid');
>> +	my $storeid = extract_param($param, 'storage');
>> +	my $prune_options = extract_param($param, 'prune-backups');
>> +
>> +	my $opts_override;
>> +	if (defined($prune_options)) {
>> +	    $opts_override = PVE::JSONSchema::parse_property_string('prune-backups', $prune_options);
>> +	}
>> +
>> +	my @res = PVE::Storage::prune_backups($cfg, $storeid, $opts_override, $vmid, 1);
>> +	return \@res;
> 
> one more reason to make the return value a reference ;)
> 
>> +    }});
>> +
>> +__PACKAGE__->register_method ({
>> +    name => 'delete',
>> +    path => '',
>> +    method => 'DELETE',
>> +    description => "Prune backups. Only those using the standard naming scheme are considered.",
>> +    permissions => {
>> +	description => "You need the 'Datastore.Allocate' privilege on the storage " .
>> +		       "(or if a VM ID is specified, 'Datastore.AllocateSpace' and 'VM.Backup' for the VM).",
>> +	user => 'all',
>> +    },
>> +    protected => 1,
>> +    proxyto => 'node',
>> +    parameters => {
>> +	additionalProperties => 0,
>> +	properties => {
>> +	    node => get_standard_option('pve-node'),
>> +	    storage => get_standard_option('pve-storage-id', {
>> +                completion => \&PVE::Storage::complete_storage,
>> +            }),
>> +	    'prune-backups' => get_standard_option('prune-backups', {
>> +		description => "Use these retention options instead of those from the storage configuration.",
>> +	    }),
>> +	    vmid => get_standard_option('pve-vmid', {
>> +		description => "Only prune backups for this VM.",
>> +		completion => \&PVE::Cluster::complete_vmid,
>> +		optional => 1,
>> +	    }),
>> +	},
>> +    },
>> +    returns => { type => 'string' },
>> +    code => sub {
>> +	my ($param) = @_;
>> +
>> +	my $rpcenv = PVE::RPCEnvironment::get();
>> +	my $authuser = $rpcenv->get_user();
>> +
>> +	my $cfg = PVE::Storage::config();
>> +
>> +	my $vmid = extract_param($param, 'vmid');
>> +	my $storeid = extract_param($param, 'storage');
>> +	my $prune_options = extract_param($param, 'prune-backups');
>> +
>> +	my $opts_override;
>> +	if (defined($prune_options)) {
>> +	    $opts_override = PVE::JSONSchema::parse_property_string('prune-backups', $prune_options);
>> +	}
>> +
>> +	if (defined($vmid)) {
>> +	    $rpcenv->check($authuser, "/storage/$storeid", ['Datastore.AllocateSpace']);
>> +	    $rpcenv->check($authuser, "/vms/$vmid", ['VM.Backup']);
>> +	} else {
>> +	    $rpcenv->check($authuser, "/storage/$storeid", ['Datastore.Allocate']);
>> +	}
>> +
>> +	my $id = (defined($vmid) ? "$vmid@" : '') . $storeid;
>> +	my $worker = sub {
>> +	    PVE::Storage::prune_backups($cfg, $storeid, $opts_override, $vmid, 0);
> 
> it would be nice to print progress in $PLUGIN::prune_backups (e.g. when
> iterating over the entries before deletion, otherwise neither this API
> call nor the CLI wrapper give any indication which backup archives got
> pruned?
> 

For Plugin.pm one can print something while iterating, but for 
PBSPlugin.pm that's not really possible is it? So maybe just output 
something like "executing proxmox-backup-client prune ..." there?

>> +	};
>> +
>> +	return $rpcenv->fork_worker('prunebackups', $id, $authuser, $worker);
>> +    }});
>> +
>> +1;
>> diff --git a/PVE/API2/Storage/Status.pm b/PVE/API2/Storage/Status.pm
>> index d9d9b36..d12643f 100644
>> --- a/PVE/API2/Storage/Status.pm
>> +++ b/PVE/API2/Storage/Status.pm
>> @@ -11,6 +11,7 @@ use PVE::Cluster;
>>   use PVE::RRD;
>>   use PVE::Storage;
>>   use PVE::API2::Storage::Content;
>> +use PVE::API2::Storage::PruneBackups;
>>   use PVE::RESTHandler;
>>   use PVE::RPCEnvironment;
>>   use PVE::JSONSchema qw(get_standard_option);
>> @@ -18,6 +19,11 @@ use PVE::Exception qw(raise_param_exc);
>>   
>>   use base qw(PVE::RESTHandler);
>>   
>> +__PACKAGE__->register_method ({
>> +    subclass => "PVE::API2::Storage::PruneBackups",
>> +    path => '{storage}/prunebackups',
>> +});
>> +
>>   __PACKAGE__->register_method ({
>>       subclass => "PVE::API2::Storage::Content",
>>       # set fragment delimiter (no subdirs) - we need that, because volume
>> @@ -214,6 +220,7 @@ __PACKAGE__->register_method ({
>>   	    { subdir => 'upload' },
>>   	    { subdir => 'rrd' },
>>   	    { subdir => 'rrddata' },
>> +	    { subdir => 'prunebackups' },
>>   	];
>>   
>>   	return $res;
>> diff --git a/PVE/CLI/pvesm.pm b/PVE/CLI/pvesm.pm
>> index 30bdcf6..478842f 100755
>> --- a/PVE/CLI/pvesm.pm
>> +++ b/PVE/CLI/pvesm.pm
>> @@ -12,8 +12,10 @@ use PVE::Cluster;
>>   use PVE::INotify;
>>   use PVE::RPCEnvironment;
>>   use PVE::Storage;
>> +use PVE::Tools;
>>   use PVE::API2::Storage::Config;
>>   use PVE::API2::Storage::Content;
>> +use PVE::API2::Storage::PruneBackups;
>>   use PVE::API2::Storage::Status;
>>   use PVE::JSONSchema qw(get_standard_option);
>>   use PVE::PTY;
>> @@ -818,6 +820,31 @@ our $cmddef = {
>>   	print "APIVER $res->{apiver}\n";
>>   	print "APIAGE $res->{apiage}\n";
>>       }],
>> +    'prune-backups-list' => [ "PVE::API2::Storage::PruneBackups", 'index', ['storage'], { node => $nodename }, sub {
> 
> I know this is easier since it's a one-to-one mapping to the API
> endpoints, but IMHO this would really make more sense to have as an
> option to 'pvesm prune-backups' from a usability POV..
> 

Ok

>> +	my $res = shift;
>> +
>> +	my @sorted = sort {
>> +	    my $vmcmp = PVE::Tools::safe_compare($a->{vmid}, $b->{vmid}, sub { $_[0] <=> $_[1] });
>> +	    return $vmcmp if $vmcmp ne 0;
>> +	    return $a->{ctime} <=> $b->{ctime};
> 
> should sort by type first in case IDs are re-used between containers and
> VMs
> 

Ok

>> +	} @{$res};
>> +
>> +	my $maxlen = 0;
>> +	foreach my $backup (@sorted) {
>> +	    my $volid = $backup->{volid};
>> +	    $maxlen = length($volid) if length($volid) > $maxlen;
>> +	}
>> +	$maxlen+=1;
>> +
>> +	printf("%-${maxlen}s %15s %10s\n", 'Backup', 'Backup-ID', 'Prune-Mark');
>> +	foreach my $backup (@sorted) {
>> +	    my $type = $backup->{type};
>> +	    my $vmid = $backup->{vmid};
>> +	    my $backup_id = defined($vmid) ? "$type/$vmid" : "$type";
>> +	    printf("%-${maxlen}s %15s %10s\n", $backup->{volid}, $backup_id, $backup->{mark});
>> +	}
>> +    }],
>> +    'prune-backups' => [ "PVE::API2::Storage::PruneBackups", 'delete', ['storage'], { node => $nodename }, ],
>>   };
> 
> for the CLI it would probably be nice to have an interactive mode?
> 
> pvesm prune-backups <storage> [vmid] [--interactive]
> ...
> list of backups and their prune status
> 
> do you want to prune these backups? [y|N]
>    ...
> 

Should this be the default behavior for the CLI command? Then we can 
have something like --yes or --force to skip the question.

> 
> in general it would be nice also for API users if we had a way to
> - get list of to-be-pruned backups (client <-> API)
> - ask user for confirmation (client <-> user)
> - prune exactly those confirmed archives (client <-> API)
> 
> otherwise the dry-run mode is a bit dangerous as it offers a false sense
> of security. the last step maybe even only if all the ones marked as
> keep still exist? I don't know, just some food for thought..
> 
>>   
>>   1;
>> -- 
>> 2.20.1
>>
>>
>> _______________________________________________
>> pve-devel mailing list
>> pve-devel@pve.proxmox.com
>> https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
>>
>>
> 
> _______________________________________________
> pve-devel mailing list
> pve-devel@pve.proxmox.com
> https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
> 




^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [pve-devel] [PATCH storage 6/6] Add API and pvesm calls for prune_backups
  2020-07-08  7:48     ` [pve-devel] [PATCH storage 6/6] Add API and pvesm calls for prune_backups Fabian Ebner
@ 2020-07-08  8:17       ` Fabian Grünbichler
  0 siblings, 0 replies; 4+ messages in thread
From: Fabian Grünbichler @ 2020-07-08  8:17 UTC (permalink / raw)
  To: Fabian Ebner, pve-devel; +Cc: Thomas Lamprecht

On July 8, 2020 9:48 am, Fabian Ebner wrote:
> Am 07.07.20 um 08:46 schrieb Fabian Grünbichler:
>> s/VM/guest in most descriptions - this is not in qemu-server ;)
>> 
>> On June 4, 2020 11:08 am, Fabian Ebner wrote:
>>> Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
>>> ---
>>>
>>> Not sure if this is the best place for the new API endpoints.
>>>
>>> I decided to opt for two distinct calls rather than just using a
>>> --dry-run option and use a worker for actually pruning, because
>>> removing many backups over the network might take a while.
>> 
>> see comments below
>> 
>>>
>>>   PVE/API2/Storage/Makefile        |   2 +-
>>>   PVE/API2/Storage/PruneBackups.pm | 153 +++++++++++++++++++++++++++++++
>>>   PVE/API2/Storage/Status.pm       |   7 ++
>>>   PVE/CLI/pvesm.pm                 |  27 ++++++
>>>   4 files changed, 188 insertions(+), 1 deletion(-)
>>>   create mode 100644 PVE/API2/Storage/PruneBackups.pm
>>>
>>> diff --git a/PVE/API2/Storage/Makefile b/PVE/API2/Storage/Makefile
>>> index a33525b..3f667e8 100644
>>> --- a/PVE/API2/Storage/Makefile
>>> +++ b/PVE/API2/Storage/Makefile
>>> @@ -1,5 +1,5 @@
>>>   
>>> -SOURCES= Content.pm Status.pm Config.pm
>>> +SOURCES= Content.pm Status.pm Config.pm PruneBackups.pm
>>>   
>>>   .PHONY: install
>>>   install:
>>> diff --git a/PVE/API2/Storage/PruneBackups.pm b/PVE/API2/Storage/PruneBackups.pm
>>> new file mode 100644
>>> index 0000000..7968730
>>> --- /dev/null
>>> +++ b/PVE/API2/Storage/PruneBackups.pm
>>> @@ -0,0 +1,153 @@
>>> +package PVE::API2::Storage::PruneBackups;
>>> +
>>> +use strict;
>>> +use warnings;
>>> +
>>> +use PVE::Cluster;
>>> +use PVE::JSONSchema qw(get_standard_option);
>>> +use PVE::RESTHandler;
>>> +use PVE::RPCEnvironment;
>>> +use PVE::Storage;
>>> +use PVE::Tools qw(extract_param);
>>> +
>>> +use base qw(PVE::RESTHandler);
>>> +
>>> +__PACKAGE__->register_method ({
>>> +    name => 'index',
>> 
>> IMHO this is not an index, but just a regular 'GET' API path.
>> 
> 
> Ok
> 
>>> +    path => '',
>>> +    method => 'GET',
>>> +    description => "Get prune information for backups.",
>>> +    permissions => {
>>> +	check => ['perm', '/storage/{storage}', ['Datastore.Audit', 'Datastore.AllocateSpace'], any => 1],
>>> +    },
>>> +    protected => 1,
>>> +    proxyto => 'node',
>>> +    parameters => {
>>> +	additionalProperties => 0,
>>> +	properties => {
>>> +	    node => get_standard_option('pve-node'),
>>> +	    storage => get_standard_option('pve-storage-id', {
>>> +		completion => \&PVE::Storage::complete_storage_enabled,
>>> +            }),
>>> +	    'prune-backups' => get_standard_option('prune-backups', {
>>> +		description => "Use these retention options instead of those from the storage configuration.",
>>> +		optional => 1,
>>> +	    }),
>>> +	    vmid => get_standard_option('pve-vmid', {
>>> +		description => "Only consider backups for this VM.",
>> 
>> of this guest
>> 
>> would it make sense to allow specification of guest-type as well, so
>> that it's possible to indirectly specify the 'backup-group' ?
>> 
> 
> We don't really support having backups of two kinds of guest with the 
> same ID currently, do we? If this is really needed at some point, we can 
> still extend this, but for now it should be enough if the such backups 
> get grouped correctly as to not interfere with one another.

well, there could be backups for a container with ID 100 that no longer 
exists, and a VM with ID 100 (that might or might not still exist). 
since those are pruned/grouped separately, we might offer this as filter 
here as well. also, I might only want to prune all VM backups right now ;)

> 
>>> +		optional => 1,
>>> +		completion => \&PVE::Cluster::complete_vmid,
>>> +	    }),
>>> +	},
>>> +    },
>>> +    returns => {
>>> +	type => 'array',
>>> +	items => {
>>> +	    type => 'object',
>>> +	    properties => {
>>> +		volid => {
>>> +		    description => "Backup volume ID.",
>>> +		    type => 'string',
>>> +		},
>>> +		type => {
>>> +		    description => "One of 'qemu', 'lxc', 'openvz' or 'unknown'.",
>>> +		    type => 'string',
>>> +		},
>>> +		'ctime' => {
>>> +		    description => "Creation time of the backup (seconds since the UNIX epoch).",
>>> +		    type => 'integer',
>>> +		},
>>> +		'mark' => {
>>> +		    description => "Whether the backup would be kept or removed. For backups that don't " .
>>> +				   "use the standard naming scheme, it's 'protected'.",
>>> +		    type => 'string',
>>> +		},
>>> +		'vmid' => {
>>> +		    description => "The VM the backup belongs to.",
>>> +		    type => 'integer',
>>> +		    optional => 1,
>>> +		},
>>> +	    },
>>> +	},
>>> +    },
>>> +    code => sub {
>>> +	my ($param) = @_;
>>> +
>>> +	my $cfg = PVE::Storage::config();
>>> +
>>> +	my $vmid = extract_param($param, 'vmid');
>>> +	my $storeid = extract_param($param, 'storage');
>>> +	my $prune_options = extract_param($param, 'prune-backups');
>>> +
>>> +	my $opts_override;
>>> +	if (defined($prune_options)) {
>>> +	    $opts_override = PVE::JSONSchema::parse_property_string('prune-backups', $prune_options);
>>> +	}
>>> +
>>> +	my @res = PVE::Storage::prune_backups($cfg, $storeid, $opts_override, $vmid, 1);
>>> +	return \@res;
>> 
>> one more reason to make the return value a reference ;)
>> 
>>> +    }});
>>> +
>>> +__PACKAGE__->register_method ({
>>> +    name => 'delete',
>>> +    path => '',
>>> +    method => 'DELETE',
>>> +    description => "Prune backups. Only those using the standard naming scheme are considered.",
>>> +    permissions => {
>>> +	description => "You need the 'Datastore.Allocate' privilege on the storage " .
>>> +		       "(or if a VM ID is specified, 'Datastore.AllocateSpace' and 'VM.Backup' for the VM).",
>>> +	user => 'all',
>>> +    },
>>> +    protected => 1,
>>> +    proxyto => 'node',
>>> +    parameters => {
>>> +	additionalProperties => 0,
>>> +	properties => {
>>> +	    node => get_standard_option('pve-node'),
>>> +	    storage => get_standard_option('pve-storage-id', {
>>> +                completion => \&PVE::Storage::complete_storage,
>>> +            }),
>>> +	    'prune-backups' => get_standard_option('prune-backups', {
>>> +		description => "Use these retention options instead of those from the storage configuration.",
>>> +	    }),
>>> +	    vmid => get_standard_option('pve-vmid', {
>>> +		description => "Only prune backups for this VM.",
>>> +		completion => \&PVE::Cluster::complete_vmid,
>>> +		optional => 1,
>>> +	    }),
>>> +	},
>>> +    },
>>> +    returns => { type => 'string' },
>>> +    code => sub {
>>> +	my ($param) = @_;
>>> +
>>> +	my $rpcenv = PVE::RPCEnvironment::get();
>>> +	my $authuser = $rpcenv->get_user();
>>> +
>>> +	my $cfg = PVE::Storage::config();
>>> +
>>> +	my $vmid = extract_param($param, 'vmid');
>>> +	my $storeid = extract_param($param, 'storage');
>>> +	my $prune_options = extract_param($param, 'prune-backups');
>>> +
>>> +	my $opts_override;
>>> +	if (defined($prune_options)) {
>>> +	    $opts_override = PVE::JSONSchema::parse_property_string('prune-backups', $prune_options);
>>> +	}
>>> +
>>> +	if (defined($vmid)) {
>>> +	    $rpcenv->check($authuser, "/storage/$storeid", ['Datastore.AllocateSpace']);
>>> +	    $rpcenv->check($authuser, "/vms/$vmid", ['VM.Backup']);
>>> +	} else {
>>> +	    $rpcenv->check($authuser, "/storage/$storeid", ['Datastore.Allocate']);
>>> +	}
>>> +
>>> +	my $id = (defined($vmid) ? "$vmid@" : '') . $storeid;
>>> +	my $worker = sub {
>>> +	    PVE::Storage::prune_backups($cfg, $storeid, $opts_override, $vmid, 0);
>> 
>> it would be nice to print progress in $PLUGIN::prune_backups (e.g. when
>> iterating over the entries before deletion, otherwise neither this API
>> call nor the CLI wrapper give any indication which backup archives got
>> pruned?
>> 
> 
> For Plugin.pm one can print something while iterating, but for 
> PBSPlugin.pm that's not really possible is it? So maybe just output 
> something like "executing proxmox-backup-client prune ..." there?

yes, for PBS it's not really possible since the actual removal is 
delegated to GC anyway.

> 
>>> +	};
>>> +
>>> +	return $rpcenv->fork_worker('prunebackups', $id, $authuser, $worker);
>>> +    }});
>>> +
>>> +1;
>>> diff --git a/PVE/API2/Storage/Status.pm b/PVE/API2/Storage/Status.pm
>>> index d9d9b36..d12643f 100644
>>> --- a/PVE/API2/Storage/Status.pm
>>> +++ b/PVE/API2/Storage/Status.pm
>>> @@ -11,6 +11,7 @@ use PVE::Cluster;
>>>   use PVE::RRD;
>>>   use PVE::Storage;
>>>   use PVE::API2::Storage::Content;
>>> +use PVE::API2::Storage::PruneBackups;
>>>   use PVE::RESTHandler;
>>>   use PVE::RPCEnvironment;
>>>   use PVE::JSONSchema qw(get_standard_option);
>>> @@ -18,6 +19,11 @@ use PVE::Exception qw(raise_param_exc);
>>>   
>>>   use base qw(PVE::RESTHandler);
>>>   
>>> +__PACKAGE__->register_method ({
>>> +    subclass => "PVE::API2::Storage::PruneBackups",
>>> +    path => '{storage}/prunebackups',
>>> +});
>>> +
>>>   __PACKAGE__->register_method ({
>>>       subclass => "PVE::API2::Storage::Content",
>>>       # set fragment delimiter (no subdirs) - we need that, because volume
>>> @@ -214,6 +220,7 @@ __PACKAGE__->register_method ({
>>>   	    { subdir => 'upload' },
>>>   	    { subdir => 'rrd' },
>>>   	    { subdir => 'rrddata' },
>>> +	    { subdir => 'prunebackups' },
>>>   	];
>>>   
>>>   	return $res;
>>> diff --git a/PVE/CLI/pvesm.pm b/PVE/CLI/pvesm.pm
>>> index 30bdcf6..478842f 100755
>>> --- a/PVE/CLI/pvesm.pm
>>> +++ b/PVE/CLI/pvesm.pm
>>> @@ -12,8 +12,10 @@ use PVE::Cluster;
>>>   use PVE::INotify;
>>>   use PVE::RPCEnvironment;
>>>   use PVE::Storage;
>>> +use PVE::Tools;
>>>   use PVE::API2::Storage::Config;
>>>   use PVE::API2::Storage::Content;
>>> +use PVE::API2::Storage::PruneBackups;
>>>   use PVE::API2::Storage::Status;
>>>   use PVE::JSONSchema qw(get_standard_option);
>>>   use PVE::PTY;
>>> @@ -818,6 +820,31 @@ our $cmddef = {
>>>   	print "APIVER $res->{apiver}\n";
>>>   	print "APIAGE $res->{apiage}\n";
>>>       }],
>>> +    'prune-backups-list' => [ "PVE::API2::Storage::PruneBackups", 'index', ['storage'], { node => $nodename }, sub {
>> 
>> I know this is easier since it's a one-to-one mapping to the API
>> endpoints, but IMHO this would really make more sense to have as an
>> option to 'pvesm prune-backups' from a usability POV..
>> 
> 
> Ok
> 
>>> +	my $res = shift;
>>> +
>>> +	my @sorted = sort {
>>> +	    my $vmcmp = PVE::Tools::safe_compare($a->{vmid}, $b->{vmid}, sub { $_[0] <=> $_[1] });
>>> +	    return $vmcmp if $vmcmp ne 0;
>>> +	    return $a->{ctime} <=> $b->{ctime};
>> 
>> should sort by type first in case IDs are re-used between containers and
>> VMs
>> 
> 
> Ok
> 
>>> +	} @{$res};
>>> +
>>> +	my $maxlen = 0;
>>> +	foreach my $backup (@sorted) {
>>> +	    my $volid = $backup->{volid};
>>> +	    $maxlen = length($volid) if length($volid) > $maxlen;
>>> +	}
>>> +	$maxlen+=1;
>>> +
>>> +	printf("%-${maxlen}s %15s %10s\n", 'Backup', 'Backup-ID', 'Prune-Mark');
>>> +	foreach my $backup (@sorted) {
>>> +	    my $type = $backup->{type};
>>> +	    my $vmid = $backup->{vmid};
>>> +	    my $backup_id = defined($vmid) ? "$type/$vmid" : "$type";
>>> +	    printf("%-${maxlen}s %15s %10s\n", $backup->{volid}, $backup_id, $backup->{mark});
>>> +	}
>>> +    }],
>>> +    'prune-backups' => [ "PVE::API2::Storage::PruneBackups", 'delete', ['storage'], { node => $nodename }, ],
>>>   };
>> 
>> for the CLI it would probably be nice to have an interactive mode?
>> 
>> pvesm prune-backups <storage> [vmid] [--interactive]
>> ...
>> list of backups and their prune status
>> 
>> do you want to prune these backups? [y|N]
>>    ...
>> 
> 
> Should this be the default behavior for the CLI command? Then we can 
> have something like --yes or --force to skip the question.

we could have

pvesm prune-backups <storage> [filters]

=> prune without asking

pvesm prune-backups <storage> [filters] --dry-run

=> show what would be pruned

pvesm prune-backups <storage> [filters] --interactive

=> show, ask, (potentially) prune

I am not a big fan of --yes, and --force implies other things to me 
(like attempting to prune as much as possible even in the face of 
errors).

maybe '--confirm' would also be an option instead of '--interactive'?

we could also make it magic based on whether we can prompt on STDIN, 
than script users can just </dev/null

> 
>> 
>> in general it would be nice also for API users if we had a way to
>> - get list of to-be-pruned backups (client <-> API)
>> - ask user for confirmation (client <-> user)
>> - prune exactly those confirmed archives (client <-> API)
>> 
>> otherwise the dry-run mode is a bit dangerous as it offers a false sense
>> of security. the last step maybe even only if all the ones marked as
>> keep still exist? I don't know, just some food for thought..
>> 
>>>   
>>>   1;
>>> -- 
>>> 2.20.1
>>>
>>>
>>> _______________________________________________
>>> pve-devel mailing list
>>> pve-devel@pve.proxmox.com
>>> https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
>>>
>>>
>> 
>> _______________________________________________
>> pve-devel mailing list
>> pve-devel@pve.proxmox.com
>> https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
>> 
> 




^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2020-07-08  8:26 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <20200604090838.24203-1-f.ebner@proxmox.com>
     [not found] ` <20200604090838.24203-5-f.ebner@proxmox.com>
     [not found]   ` <1594035985.yxpjph40wm.astroid@nora.none>
2020-07-08  7:20     ` [pve-devel] [PATCH storage 4/6] Add prune_backups to storage API Fabian Ebner
2020-07-08  7:26     ` Fabian Ebner
     [not found] ` <20200604090838.24203-7-f.ebner@proxmox.com>
     [not found]   ` <1594040467.ikb3n0sela.astroid@nora.none>
2020-07-08  7:48     ` [pve-devel] [PATCH storage 6/6] Add API and pvesm calls for prune_backups Fabian Ebner
2020-07-08  8:17       ` Fabian Grünbichler

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox
Service provided by Proxmox Server Solutions GmbH | Privacy | Legal