From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <s.hanreich@proxmox.com>
Received: from firstgate.proxmox.com (firstgate.proxmox.com [212.224.123.68])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
 key-exchange X25519 server-signature RSA-PSS (2048 bits))
 (No client certificate requested)
 by lists.proxmox.com (Postfix) with ESMTPS id 7D2D9B0C2
 for <pbs-devel@lists.proxmox.com>; Wed, 23 Nov 2022 10:40:56 +0100 (CET)
Received: from firstgate.proxmox.com (localhost [127.0.0.1])
 by firstgate.proxmox.com (Proxmox) with ESMTP id 5016D2058C
 for <pbs-devel@lists.proxmox.com>; Wed, 23 Nov 2022 10:40:26 +0100 (CET)
Received: from proxmox-new.maurer-it.com (proxmox-new.maurer-it.com
 [94.136.29.106])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
 key-exchange X25519 server-signature RSA-PSS (2048 bits))
 (No client certificate requested)
 by firstgate.proxmox.com (Proxmox) with ESMTPS
 for <pbs-devel@lists.proxmox.com>; Wed, 23 Nov 2022 10:40:23 +0100 (CET)
Received: from proxmox-new.maurer-it.com (localhost.localdomain [127.0.0.1])
 by proxmox-new.maurer-it.com (Proxmox) with ESMTP id 0BC4144445
 for <pbs-devel@lists.proxmox.com>; Wed, 23 Nov 2022 10:40:22 +0100 (CET)
Message-ID: <638d2751-b6a7-5b37-a74b-eefb842e65f8@proxmox.com>
Date: Wed, 23 Nov 2022 10:40:20 +0100
MIME-Version: 1.0
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101
 Thunderbird/102.5.0
To: Wolfgang Bumiller <w.bumiller@proxmox.com>
Cc: pbs-devel@lists.proxmox.com
References: <20221104143054.75419-1-s.hanreich@proxmox.com>
 <20221104143054.75419-2-s.hanreich@proxmox.com>
 <20221107110002.yr2vgnmg4e3wagax@casey.proxmox.com>
Content-Language: en-US
From: Stefan Hanreich <s.hanreich@proxmox.com>
In-Reply-To: <20221107110002.yr2vgnmg4e3wagax@casey.proxmox.com>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
X-SPAM-LEVEL: Spam detection results:  0
 AWL 0.566 Adjusted score from AWL reputation of From: address
 BAYES_00                 -1.9 Bayes spam probability is 0 to 1%
 KAM_DMARC_STATUS 0.01 Test Rule for DKIM or SPF Failure with Strict Alignment
 NICE_REPLY_A           -0.001 Looks like a legit reply (A)
 SPF_HELO_NONE           0.001 SPF: HELO does not publish an SPF Record
 SPF_PASS               -0.001 SPF: sender matches SPF record
Subject: Re: [pbs-devel] [PATCH proxmox-backup 1/3] Add KeepOptions to Sync
 Job settings
X-BeenThere: pbs-devel@lists.proxmox.com
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Proxmox Backup Server development discussion
 <pbs-devel.lists.proxmox.com>
List-Unsubscribe: <https://lists.proxmox.com/cgi-bin/mailman/options/pbs-devel>, 
 <mailto:pbs-devel-request@lists.proxmox.com?subject=unsubscribe>
List-Archive: <http://lists.proxmox.com/pipermail/pbs-devel/>
List-Post: <mailto:pbs-devel@lists.proxmox.com>
List-Help: <mailto:pbs-devel-request@lists.proxmox.com?subject=help>
List-Subscribe: <https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel>, 
 <mailto:pbs-devel-request@lists.proxmox.com?subject=subscribe>
X-List-Received-Date: Wed, 23 Nov 2022 09:40:56 -0000


On 11/7/22 12:00, Wolfgang Bumiller wrote:
> On Fri, Nov 04, 2022 at 03:30:52PM +0100, Stefan Hanreich wrote:
>> This enables users to specify prune options when creating/editing a sync
>> job. After a sync job runs successfully, a prune job with the specified
>> parameters is started that prunes the synced backup groups. This
>> behaves exactly the same as a normal prune job.
>>
>> Signed-off-by: Stefan Hanreich <s.hanreich@proxmox.com>
>> ---
>>   pbs-api-types/src/jobs.rs         |  7 +++-
>>   src/api2/config/sync.rs           | 66 +++++++++++++++++++++++++++--
>>   src/api2/pull.rs                  | 11 ++++-
>>   src/bin/proxmox-backup-manager.rs | 20 +++++++--
>>   src/server/pull.rs                | 70 +++++++++++++++++++++++++++----
>>   5 files changed, 156 insertions(+), 18 deletions(-)
>>
>> diff --git a/pbs-api-types/src/jobs.rs b/pbs-api-types/src/jobs.rs
>> index 7f029af7..c778039c 100644
>> --- a/pbs-api-types/src/jobs.rs
>> +++ b/pbs-api-types/src/jobs.rs
>> @@ -474,6 +474,9 @@ pub const GROUP_FILTER_LIST_SCHEMA: Schema =
>>           limit: {
>>               type: RateLimitConfig,
>>           },
>> +        keep: {
>> +            type: KeepOptions,
>> +        },
>>           schedule: {
>>               optional: true,
>>               schema: SYNC_SCHEDULE_SCHEMA,
>> @@ -511,6 +514,8 @@ pub struct SyncJobConfig {
>>       pub group_filter: Option<Vec<GroupFilter>>,
>>       #[serde(flatten)]
>>       pub limit: RateLimitConfig,
>> +    #[serde(flatten)]
>> +    pub keep: KeepOptions,
>>   }
>>   
>>   impl SyncJobConfig {
>> @@ -572,7 +577,7 @@ pub struct SyncJobStatus {
>>           },
>>       }
>>   )]
>> -#[derive(Serialize, Deserialize, Default, Updater)]
>> +#[derive(Serialize, Deserialize, Default, Updater, Clone)]
>>   #[serde(rename_all = "kebab-case")]
>>   /// Common pruning options
>>   pub struct KeepOptions {
>> diff --git a/src/api2/config/sync.rs b/src/api2/config/sync.rs
>> index 6f39b239..82231715 100644
>> --- a/src/api2/config/sync.rs
>> +++ b/src/api2/config/sync.rs
>> @@ -7,9 +7,10 @@ use proxmox_router::{http_bail, Permission, Router, RpcEnvironment};
>>   use proxmox_schema::{api, param_bail};
>>   
>>   use pbs_api_types::{
>> -    Authid, SyncJobConfig, SyncJobConfigUpdater, JOB_ID_SCHEMA, PRIV_DATASTORE_AUDIT,
>> -    PRIV_DATASTORE_BACKUP, PRIV_DATASTORE_MODIFY, PRIV_DATASTORE_PRUNE, PRIV_REMOTE_AUDIT,
>> -    PRIV_REMOTE_READ, PROXMOX_CONFIG_DIGEST_SCHEMA,
>> +    Authid, SyncJobConfig, SyncJobConfigUpdater, JOB_ID_SCHEMA,
>> +    PRIV_DATASTORE_AUDIT, PRIV_DATASTORE_BACKUP, PRIV_DATASTORE_MODIFY,
>> +    PRIV_DATASTORE_PRUNE, PRIV_REMOTE_AUDIT, PRIV_REMOTE_READ,
>> +    PROXMOX_CONFIG_DIGEST_SCHEMA
>>   };
>>   use pbs_config::sync;
>>   
>> @@ -216,6 +217,18 @@ pub enum DeletableProperty {
>>       remote_ns,
>>       /// Delete the max_depth property,
>>       max_depth,
>> +    /// Delete keep_last prune option.
>> +    keep_last,
>> +    /// Delete keep_hourly prune option.
>> +    keep_hourly,
>> +    /// Delete keep_daily prune option.
>> +    keep_daily,
>> +    /// Delete keep_weekly prune option.
>> +    keep_weekly,
>> +    /// Delete keep_monthly prune option.
>> +    keep_monthly,
>> +    /// Delete keep_yearly prune option.
>> +    keep_yearly
>>   }
>>   
>>   #[api(
>> @@ -310,6 +323,24 @@ pub fn update_sync_job(
>>                   DeletableProperty::max_depth => {
>>                       data.max_depth = None;
>>                   }
>> +                DeletableProperty::keep_last => {
>> +                    data.keep.keep_last = None;
>> +                }
>> +                DeletableProperty::keep_hourly => {
>> +                    data.keep.keep_hourly = None;
>> +                }
>> +                DeletableProperty::keep_daily => {
>> +                    data.keep.keep_daily = None;
>> +                }
>> +                DeletableProperty::keep_weekly => {
>> +                    data.keep.keep_weekly = None;
>> +                }
>> +                DeletableProperty::keep_monthly => {
>> +                    data.keep.keep_monthly = None;
>> +                }
>> +                DeletableProperty::keep_yearly => {
>> +                    data.keep.keep_yearly = None;
>> +                }
>>               }
>>           }
>>       }
>> @@ -381,6 +412,25 @@ pub fn update_sync_job(
>>           }
>>       }
>>   
>> +    if update.keep.keep_last.is_some() {
>> +        data.keep.keep_last = update.keep.keep_last;
>> +    }
>> +    if update.keep.keep_hourly.is_some() {
>> +        data.keep.keep_hourly = update.keep.keep_hourly;
>> +    }
>> +    if update.keep.keep_daily.is_some() {
>> +        data.keep.keep_daily = update.keep.keep_daily;
>> +    }
>> +    if update.keep.keep_weekly.is_some() {
>> +        data.keep.keep_weekly = update.keep.keep_weekly;
>> +    }
>> +    if update.keep.keep_monthly.is_some() {
>> +        data.keep.keep_monthly = update.keep.keep_monthly;
>> +    }
>> +    if update.keep.keep_yearly.is_some() {
>> +        data.keep.keep_yearly = update.keep.keep_yearly;
>> +    }
>> +
>>       if !check_sync_job_modify_access(&user_info, &auth_id, &data) {
>>           bail!("permission check failed");
>>       }
>> @@ -463,6 +513,8 @@ pub const ROUTER: Router = Router::new()
>>   
>>   #[test]
>>   fn sync_job_access_test() -> Result<(), Error> {
>> +    use pbs_api_types::KeepOptions;
>> +
>>       let (user_cfg, _) = pbs_config::user::test_cfg_from_str(
>>           r###"
>>   user: noperm@pbs
>> @@ -508,6 +560,14 @@ acl:1:/remote/remote1/remotestore1:write@pbs:RemoteSyncOperator
>>           group_filter: None,
>>           schedule: None,
>>           limit: pbs_api_types::RateLimitConfig::default(), // no limit
>> +        keep: KeepOptions {
>> +            keep_last: None,
>> +            keep_hourly: None,
>> +            keep_daily: None,
>> +            keep_weekly: None,
>> +            keep_monthly: None,
>> +            keep_yearly: None
> ^ This could use `KeepOptions::default()`.
>
of course..
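The field then simply becomes

    keep: KeepOptions::default(),

in the next version.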
>> +        },
>>       };
>>   
>>       // should work without ACLs
>> diff --git a/src/api2/pull.rs b/src/api2/pull.rs
>> index 193f28fe..f39e0b11 100644
>> --- a/src/api2/pull.rs
>> +++ b/src/api2/pull.rs
>> @@ -9,7 +9,7 @@ use proxmox_schema::api;
>>   use proxmox_sys::task_log;
>>   
>>   use pbs_api_types::{
>> -    Authid, BackupNamespace, GroupFilter, RateLimitConfig, SyncJobConfig, DATASTORE_SCHEMA,
>> +    Authid, BackupNamespace, GroupFilter, RateLimitConfig, KeepOptions, SyncJobConfig, DATASTORE_SCHEMA,
>>       GROUP_FILTER_LIST_SCHEMA, NS_MAX_DEPTH_REDUCED_SCHEMA, PRIV_DATASTORE_BACKUP,
>>       PRIV_DATASTORE_PRUNE, PRIV_REMOTE_READ, REMOTE_ID_SCHEMA, REMOVE_VANISHED_BACKUPS_SCHEMA,
>>   };
>> diff --git a/src/server/pull.rs b/src/server/pull.rs
>> index 77caf327..20fda909 100644
>> --- a/src/server/pull.rs
>> +++ b/src/server/pull.rs
>> @@ -1157,5 +1162,54 @@ pub(crate) async fn pull_ns(
>>           };
>>       }
>>   
>> +    if params.keep_options.keeps_something() {
>> +        let result: Result<(), Error> = proxmox_lang::try_block!({
> ^ While we already go this route for `remove_vanished`, I'd prefer to
> have both that one and this one be separate functions, as this is
> starting to feel very spaghetti...
Sounds good, I've already refactored it now.
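
Roughly like this - the name and exact signature are still preliminary,
and the body is just the hunk below moved out of pull_ns():

    fn prune_synced_groups(
        worker: &WorkerTask,
        params: &PullParameters,
        target_ns: &BackupNamespace,
        list: Vec<BackupGroup>,
    ) -> Result<bool, Error> {
        let mut errors = false;
        task_log!(worker, "running prune job");

        for local_group in list.into_iter() {
            // same owner and group-filter checks as below
            let owner = params.store.get_owner(target_ns, &local_group)?;
            if check_backup_owner(&owner, &params.owner).is_err() {
                continue;
            }
            if let Some(ref group_filter) = params.group_filter {
                if !apply_filters(&local_group, group_filter) {
                    continue;
                }
            }

            task_log!(worker, "pruning backup group {}", &local_group);

            let backup_group = params.store.backup_group(target_ns.clone(), local_group);
            let prune_info =
                compute_prune_info(backup_group.list_backups()?, &params.keep_options)?;

            // keep/remove loop unchanged from the hunk below
            for (info, mark) in prune_info {
                if mark.keep() {
                    task_log!(worker, "keep {}", info.backup_dir.dir());
                    continue;
                }
                task_log!(worker, "remove {}", info.backup_dir.dir());
                if let Err(err) = info.backup_dir.destroy(false) {
                    task_warn!(
                        worker,
                        "failed to remove dir {:?}: {}",
                        info.backup_dir.relative_path(),
                        err,
                    );
                    errors = true;
                }
            }
        }

        Ok(errors)
    }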
>> +            task_log!(worker, "running prune job");
>> +
>> +            for local_group in list.into_iter() {
>> +                let owner = params.store.get_owner(&target_ns, &local_group)?;
>> +                if check_backup_owner(&owner, &params.owner).is_err() {
>> +                    continue;
>> +                }
>> +
>> +                if let Some(ref group_filter) = &params.group_filter {
>> +                    if !apply_filters(&local_group, group_filter) {
>> +                        continue;
>> +                    }
>> +                }
>> +
>> +                task_log!(worker, "pruning backup group {}", &local_group);
>> +
>> +                let backup_group = params.store.backup_group(target_ns.clone(), local_group);
>> +                let prune_info = compute_prune_info(backup_group.list_backups()?, &params.keep_options)?;
>> +
>> +                for (info, mark) in prune_info {
> I feel like there ought to be a helper for this loop (probably just
> a dry_run and a callback parameter would be enough?), since we have
> almost the exact same loop in `src/server/prune_job.rs` and in
> `src/api2/admin/datastore.rs`.
>
> Just one has a `dry_run` option and the other wants to collect the
> info in an array for later.

Not sure I understand 100% correctly. Adding a helper method that runs
the loop and destroys the marked BackupInfos is clear enough, I think.
The callback would then be called for every BackupInfo, receiving info
& mark as arguments.

Should the callback replace the keep/destroy functionality, or be
called in addition to it? I'd tend towards replacing keep/destroy, but
that would make the dry_run argument rather useless when a callback is
passed - unless dry_run also applies to the callback, in which case
passing a callback becomes moot. Should the callback still be invoked
when dry_run is set, or should it be skipped as well, with the helper
only producing log output (printing info/mark for every BackupInfo)?

Currently I'd tend towards the callback replacing the default
functionality (destroying the marked dirs - which could also just be
the default callback when no other one is supplied) and dry_run
disabling both the default and any supplied callback. But I can see
arguments for different behavior as well, e.g. dry_run only applying
to the keep/destroy functionality. Input on this would be appreciated,
as I am not 100% sure whether this is what you meant or whether you
had a different idea in mind. A rough sketch of what I currently have
in mind follows below.
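
All names here are hypothetical, and I'm assuming compute_prune_info
keeps returning the (BackupInfo, mark) pairs it returns today:

    /// Keep or remove backups according to their prune mark.
    /// `callback` replaces the default destroy behaviour; `dry_run`
    /// skips both and only logs what would happen.
    /// Returns whether any removal failed.
    fn apply_prune_info<F>(
        worker: &WorkerTask,
        prune_info: Vec<(BackupInfo, PruneMark)>,
        dry_run: bool,
        mut callback: Option<F>,
    ) -> Result<bool, Error>
    where
        F: FnMut(&BackupInfo, &PruneMark) -> Result<(), Error>,
    {
        let mut errors = false;

        for (info, mark) in prune_info {
            if mark.keep() {
                task_log!(worker, "keep {}", info.backup_dir.dir());
                continue;
            }

            task_log!(worker, "remove {}", info.backup_dir.dir());

            if dry_run {
                continue;
            }

            let result = match callback {
                Some(ref mut cb) => cb(&info, &mark),
                // default behaviour: destroy the marked backup dir
                None => info.backup_dir.destroy(false),
            };

            if let Err(err) = result {
                task_warn!(
                    worker,
                    "failed to remove dir {:?}: {}",
                    info.backup_dir.relative_path(),
                    err,
                );
                errors = true;
            }
        }

        Ok(errors)
    }

One wrinkle: calling this with `None` needs a type hint for F, so a
thin wrapper without the callback parameter might be nicer for the
common case.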

>> +                    if mark.keep() {
>> +                        task_log!(worker, "keep {}", info.backup_dir.dir());
>> +                        continue;
>> +                    }
>> +
>> +                    task_log!(worker, "remove {}", info.backup_dir.dir());
>> +
>> +                    if let Err(err) = info.backup_dir.destroy(false) {
>> +                        task_warn!(
>> +                            worker,
>> +                            "failed to remove dir {:?}: {}",
>> +                            info.backup_dir.relative_path(),
>> +                            err,
>> +                        );
>> +                        errors = true;
>> +                    }
>> +                }
>> +            }
>> +            Ok(())
>> +        });
>> +
>> +        if let Err(err) = result {
>> +            task_log!(worker, "error during pruning: {}", err);
>> +            errors = true;
>> +        };
>> +    }
>> +
>>       Ok((progress, errors))
>>   }
>> -- 
>> 2.30.2