Date: Mon, 17 Apr 2023 10:56:13 +0200
From: Fabian Grünbichler
To: Proxmox Backup Server development discussion
Subject: Re: [pbs-devel] [PATCH proxmox-backup v3 1/3] partial fix #3701: sync job: pull: add transfer-last parameter
Message-Id: <1681720666.308m0gkgns.astroid@yuna.none>
In-Reply-To: <20230413154156.2304040-2-s.hanreich@proxmox.com>

On April 13, 2023 5:41 pm, Stefan Hanreich wrote:
> Specifying the transfer-last parameter limits the amount of backups
> that get synced via the pull command/sync job. The parameter specifies
> how many of the N latest backups should get pulled/synced. All other
> backups will get skipped.
>
> This is particularly useful in situations where the sync target has
> less disk space than the source. Syncing all backups from the source
> is not possible if there is not enough disk space on the target.
> Additionally this can be used for limiting the amount of data
> transferred, reducing load on the network.
>
> The newest backup will always get re-synced, regardless of the setting
> of the transfer-last parameter.

we had some off-list discussion about the logging issue, so mostly
rehashing this here..
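(for context: with this patch the new parameter would be passed on the
CLI roughly like

    proxmox-backup-manager pull <remote> <remote-store> <local-store> --transfer-last 3

to only pull the three most recent snapshots per group - the exact
invocation is just a sketch based on the manager changes below.)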
>
> Signed-off-by: Stefan Hanreich
> ---
>  pbs-api-types/src/jobs.rs         | 11 +++++
>  src/api2/config/sync.rs           |  9 ++++
>  src/api2/pull.rs                  | 10 ++++-
>  src/bin/proxmox-backup-manager.rs | 11 ++++-
>  src/server/pull.rs                | 68 +++++++++++++++++++++++--------
>  5 files changed, 91 insertions(+), 18 deletions(-)
>
> diff --git a/pbs-api-types/src/jobs.rs b/pbs-api-types/src/jobs.rs
> index cf7618c4..b9f57719 100644
> --- a/pbs-api-types/src/jobs.rs
> +++ b/pbs-api-types/src/jobs.rs
> @@ -444,6 +444,11 @@ pub const GROUP_FILTER_SCHEMA: Schema = StringSchema::new(
>  pub const GROUP_FILTER_LIST_SCHEMA: Schema =
>      ArraySchema::new("List of group filters.", &GROUP_FILTER_SCHEMA).schema();
>
> +pub const TRANSFER_LAST_SCHEMA: Schema =
> +    IntegerSchema::new("The maximum amount of snapshots to be transferred (per group).")

nit: the description could also be read as "only transfer X snapshots
in this sync invocation", maybe we can find some phrasing that includes
- if there are more snapshots, they are skipped
- count starts at the end, not at the beginning (implied by "last" in
  the name)

maybe something like "Limit transfer to last N snapshots (per group),
skipping others."

> +    .minimum(1)
> +    .schema();
> +
>  #[api(
>      properties: {
>          id: {
> @@ -493,6 +498,10 @@ pub const GROUP_FILTER_LIST_SCHEMA: Schema =
>              schema: GROUP_FILTER_LIST_SCHEMA,
>              optional: true,
>          },
> +        "transfer-last": {
> +            schema: TRANSFER_LAST_SCHEMA,
> +            optional: true,
> +        },
>      }
>  )]
>  #[derive(Serialize, Deserialize, Clone, Updater, PartialEq)]
> @@ -522,6 +531,8 @@ pub struct SyncJobConfig {
>      pub group_filter: Option<Vec<GroupFilter>>,
>      #[serde(flatten)]
>      pub limit: RateLimitConfig,
> +    #[serde(skip_serializing_if = "Option::is_none")]
> +    pub transfer_last: Option<usize>,
>  }
>
>  impl SyncJobConfig {
> diff --git a/src/api2/config/sync.rs b/src/api2/config/sync.rs
> index bd7373df..01e5f2ce 100644
> --- a/src/api2/config/sync.rs
> +++ b/src/api2/config/sync.rs
> @@ -215,6 +215,8 @@ pub enum DeletableProperty {
>      RemoteNs,
>      /// Delete the max_depth property,
>      MaxDepth,
> +    /// Delete the transfer_last property,
> +    TransferLast,
>  }
>
>  #[api(
> @@ -309,6 +311,9 @@ pub fn update_sync_job(
>              DeletableProperty::MaxDepth => {
>                  data.max_depth = None;
>              }
> +            DeletableProperty::TransferLast => {
> +                data.transfer_last = None;
> +            }
>          }
>      }
>  }
> @@ -343,6 +348,9 @@ pub fn update_sync_job(
>      if let Some(group_filter) = update.group_filter {
>          data.group_filter = Some(group_filter);
>      }
> +    if let Some(transfer_last) = update.transfer_last {
> +        data.transfer_last = Some(transfer_last);
> +    }
>
>      if update.limit.rate_in.is_some() {
>          data.limit.rate_in = update.limit.rate_in;
> @@ -507,6 +515,7 @@ acl:1:/remote/remote1/remotestore1:write@pbs:RemoteSyncOperator
>          group_filter: None,
>          schedule: None,
>          limit: pbs_api_types::RateLimitConfig::default(), // no limit
> +        transfer_last: None,
>      };
>
>      // should work without ACLs
> diff --git a/src/api2/pull.rs b/src/api2/pull.rs
> index b2473ec8..daeba7cf 100644
> --- a/src/api2/pull.rs
> +++ b/src/api2/pull.rs
> @@ -10,6 +10,7 @@ use pbs_api_types::{
>      Authid, BackupNamespace, GroupFilter, RateLimitConfig, SyncJobConfig, DATASTORE_SCHEMA,
>      GROUP_FILTER_LIST_SCHEMA, NS_MAX_DEPTH_REDUCED_SCHEMA, PRIV_DATASTORE_BACKUP,
>      PRIV_DATASTORE_PRUNE, PRIV_REMOTE_READ, REMOTE_ID_SCHEMA, REMOVE_VANISHED_BACKUPS_SCHEMA,
> +    TRANSFER_LAST_SCHEMA,
>  };
>  use pbs_config::CachedUserInfo;
>  use proxmox_rest_server::WorkerTask;
> @@ -76,6 +77,7 @@ impl TryFrom<&SyncJobConfig> for PullParameters {
>          sync_job.max_depth,
>          sync_job.group_filter.clone(),
>          sync_job.limit.clone(),
> +        sync_job.transfer_last,
>      )
>  }
> @@ -201,7 +203,11 @@ pub fn do_sync_job(
>          limit: {
>              type: RateLimitConfig,
>              flatten: true,
> -        }
> +        },
> +        "transfer-last": {
> +            schema: TRANSFER_LAST_SCHEMA,
> +            optional: true,
> +        },
>      },
>  },
>  access: {
> @@ -225,6 +231,7 @@ async fn pull(
>      max_depth: Option<usize>,
>      group_filter: Option<Vec<GroupFilter>>,
>      limit: RateLimitConfig,
> +    transfer_last: Option<usize>,
>      rpcenv: &mut dyn RpcEnvironment,
>  ) -> Result<String, Error> {
>      let auth_id: Authid = rpcenv.get_auth_id().unwrap().parse()?;
> @@ -257,6 +264,7 @@ async fn pull(
>          max_depth,
>          group_filter,
>          limit,
> +        transfer_last,
>      )?;
>      let client = pull_params.client().await?;
>
> diff --git a/src/bin/proxmox-backup-manager.rs b/src/bin/proxmox-backup-manager.rs
> index 740fdc49..b4cb6cb3 100644
> --- a/src/bin/proxmox-backup-manager.rs
> +++ b/src/bin/proxmox-backup-manager.rs
> @@ -13,7 +13,7 @@ use pbs_api_types::percent_encoding::percent_encode_component;
>  use pbs_api_types::{
>      BackupNamespace, GroupFilter, RateLimitConfig, SyncJobConfig, DATASTORE_SCHEMA,
>      GROUP_FILTER_LIST_SCHEMA, IGNORE_VERIFIED_BACKUPS_SCHEMA, NS_MAX_DEPTH_SCHEMA,
> -    REMOTE_ID_SCHEMA, REMOVE_VANISHED_BACKUPS_SCHEMA, UPID_SCHEMA,
> +    REMOTE_ID_SCHEMA, REMOVE_VANISHED_BACKUPS_SCHEMA, TRANSFER_LAST_SCHEMA, UPID_SCHEMA,
>      VERIFICATION_OUTDATED_AFTER_SCHEMA,
>  };
>  use pbs_client::{display_task_log, view_task_result};
> @@ -272,6 +272,10 @@ fn task_mgmt_cli() -> CommandLineInterface {
>              schema: OUTPUT_FORMAT,
>              optional: true,
>          },
> +        "transfer-last": {
> +            schema: TRANSFER_LAST_SCHEMA,
> +            optional: true,
> +        },
>      }
>  }
> )]
> @@ -287,6 +291,7 @@ async fn pull_datastore(
>      max_depth: Option<usize>,
>      group_filter: Option<Vec<GroupFilter>>,
>      limit: RateLimitConfig,
> +    transfer_last: Option<usize>,
>      param: Value,
>  ) -> Result<Value, Error> {
>      let output_format = get_output_format(&param);
> @@ -319,6 +324,10 @@ async fn pull_datastore(
>          args["remove-vanished"] = Value::from(remove_vanished);
>      }
>
> +    if transfer_last.is_some() {
> +        args["transfer-last"] = json!(transfer_last)
> +    }
> +
>      let mut limit_json = json!(limit);
>      let limit_map = limit_json
>          .as_object_mut()
> diff --git a/src/server/pull.rs b/src/server/pull.rs
> index 65eedf2c..37058f9b 100644
> --- a/src/server/pull.rs
> +++ b/src/server/pull.rs
> @@ -59,6 +59,8 @@ pub(crate) struct PullParameters {
>      group_filter: Option<Vec<GroupFilter>>,
>      /// Rate limits for all transfers from `remote`
>      limit: RateLimitConfig,
> +    /// How many snapshots should be transferred at most (taking the newest N snapshots)
> +    transfer_last: Option<usize>,
>  }
>
>  impl PullParameters {
> @@ -78,6 +80,7 @@ impl PullParameters {
>          max_depth: Option<usize>,
>          group_filter: Option<Vec<GroupFilter>>,
>          limit: RateLimitConfig,
> +        transfer_last: Option<usize>,
>      ) -> Result<Self, Error> {
>          let store = DataStore::lookup_datastore(store, Some(Operation::Write))?;
>
> @@ -109,6 +112,7 @@ impl PullParameters {
>              max_depth,
>              group_filter,
>              limit,
> +            transfer_last,
>          })
>      }
>
> @@ -537,15 +541,24 @@ async fn pull_snapshot_from(
>      Ok(())
>  }
>
> +enum SkipReason {
> +    AlreadySynced,
> +    TransferLast,
> +}
> +

this could be kept, but

>  struct SkipInfo {
>      oldest: i64,
>      newest: i64,
> -    count: u64,
> +    already_synced_count: u64,
> +    transfer_last_count: u64,
>  }

instead of this, we could just store the reason here.
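something like this (rough sketch only, naming up for debate):

    struct SkipInfo {
        oldest: i64,
        newest: i64,
        count: u64,
        skip_reason: SkipReason,
    }

    impl SkipInfo {
        fn new(skip_reason: SkipReason) -> Self {
            SkipInfo {
                oldest: i64::MAX,
                newest: i64::MIN,
                count: 0,
                skip_reason,
            }
        }
    }

that keeps a single counter per instance, and the Display impl can
then pick the right message based on `skip_reason`.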
>
>  impl SkipInfo {
> -    fn update(&mut self, backup_time: i64) {
> -        self.count += 1;
> +    fn update(&mut self, backup_time: i64, skip_reason: SkipReason) {
> +        match skip_reason {
> +            SkipReason::AlreadySynced => self.already_synced_count += 1,
> +            SkipReason::TransferLast => self.transfer_last_count += 1,
> +        }

this would then not be needed

>
>          if backup_time < self.oldest {
>              self.oldest = backup_time;
> @@ -556,8 +569,13 @@ impl SkipInfo {
>          }
>      }
>
> +    fn count(&self) -> u64 {
> +        self.already_synced_count
> +            .saturating_add(self.transfer_last_count)
> +    }
> +

neither would this

>      fn affected(&self) -> Result<String, Error> {
> -        match self.count {
> +        match self.count() {

or this

>              0 => Ok(String::new()),
>              1 => Ok(proxmox_time::epoch_to_rfc3339_utc(self.oldest)?),
>              _ => Ok(format!(
> @@ -574,12 +592,23 @@ impl std::fmt::Display for SkipInfo {
>          write!(
>              f,
>              "skipped: {} snapshot(s) ({}) older than the newest local snapshot",
> -            self.count,
> +            self.count(),

here, only the format string would need to be conditionalized based on
the `reason`..

>              self.affected().map_err(|_| std::fmt::Error)?
>          )
>      }
>  }
>
> +impl Default for SkipInfo {
> +    fn default() -> Self {
> +        SkipInfo {
> +            oldest: i64::MAX,
> +            newest: i64::MIN,
> +            already_synced_count: 0,
> +            transfer_last_count: 0,
> +        }
> +    }
> +}
> +

not sure whether this is needed.. probably we'd rather have a
constructor taking a `SkipReason`?

>  /// Pulls a group according to `params`.
>  ///
>  /// Pulling a group consists of the following steps:
> @@ -632,6 +661,7 @@ async fn pull_group(
>      let fingerprint = client.fingerprint();
>
>      let last_sync = params.store.last_successful_backup(&target_ns, group)?;
> +    let last_sync_time = last_sync.unwrap_or(i64::MIN);
>
>      let mut remote_snapshots = std::collections::HashSet::new();
>
> @@ -640,11 +670,14 @@
>
>      progress.group_snapshots = list.len() as u64;
>
> -    let mut skip_info = SkipInfo {
> -        oldest: i64::MAX,
> -        newest: i64::MIN,
> -        count: 0,
> -    };
> +    let mut skip_info = SkipInfo::default();

here, we would then construct 2 `SkipInfo`s, one for "old" snapshots,
one for "new, but skipped by transfer-last".

> +
> +    let total_amount = list.len();
> +
> +    let cutoff = params
> +        .transfer_last
> +        .map(|count| total_amount.saturating_sub(count))
> +        .unwrap_or_default();
>
>      for (pos, item) in list.into_iter().enumerate() {
>          let snapshot = item.backup;
> @@ -661,11 +694,14 @@
>
>          remote_snapshots.insert(snapshot.time);
>
> -        if let Some(last_sync_time) = last_sync {
> -            if last_sync_time > snapshot.time {
> -                skip_info.update(snapshot.time);
> -                continue;
> -            }
> +        if last_sync_time > snapshot.time {
> +            skip_info.update(snapshot.time, SkipReason::AlreadySynced);
> +            continue;
> +        }
> +
> +        if pos < cutoff && last_sync_time != snapshot.time {
> +            skip_info.update(snapshot.time, SkipReason::TransferLast);
> +            continue;
> +        }
>

and ideally here, we could find good criteria for printing the
corresponding SkipInfo at the point where we switch from skipping to
transferring for both cases.. maybe

  else if info.count > 0 { print info; clear info; // prevents taking this if again }

for the first one and

  else if pos >= cutoff && other_info.count > 0 { .. }

for the second one would be enough?
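as a rough, untested sketch of what I mean (assuming the constructor
variant from above, and a made-up `reset()` helper that just zeroes the
counter):

    let mut already_synced_skip_info = SkipInfo::new(SkipReason::AlreadySynced);
    let mut transfer_last_skip_info = SkipInfo::new(SkipReason::TransferLast);

    for (pos, item) in list.into_iter().enumerate() {
        // ...

        if last_sync_time > snapshot.time {
            already_synced_skip_info.update(snapshot.time);
            continue;
        } else if already_synced_skip_info.count > 0 {
            // first snapshot that is not "too old" - log the skipped ones
            task_log!(worker, "{}", already_synced_skip_info);
            already_synced_skip_info.reset(); // prevents taking this branch again
        }

        if pos < cutoff && last_sync_time != snapshot.time {
            transfer_last_skip_info.update(snapshot.time);
            continue;
        } else if pos >= cutoff && transfer_last_skip_info.count > 0 {
            // past the cutoff - log snapshots skipped because of transfer-last
            task_log!(worker, "{}", transfer_last_skip_info);
            transfer_last_skip_info.reset();
        }

        // ... pull the snapshot as before
    }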
that way the sequence of logs would be:
- skipped old (happens in most jobs, except for first run)
- re-sync of last synced snapshot (if it still exists on source)
- skipped because of transfer-last (if set and skips something)
- sync of new snapshots (if they exist)

which fits the chronological order of the snapshots contained in the
log messages, which would be nicer compared to

- re-sync last synced snapshot
- sync new snapshots
- print skip info of old snapshots
- print skip info of transfer-last snapshots

in my opinion.

in any case, tracking the two skip infos separately allows us to
- give more information (end of regular skip, start of transfer-last
  skip - there can be a snapshot in between that is re-synced after all)
- not handle newlines in a weird way
- save a line in the log, since we can just print each on one line,
  instead of the combined one + two lines for the counts

if we ever add more criteria on the snapshot level that cause skipping,
we can still re-evaluate and possibly have some kind of map/vec for the
counts, and move the reason to the key/index, but for two kinds there's
no reason to not keep it simple for now IMHO.

>          // get updated auth_info (new tickets)
> @@ -725,7 +761,7 @@ async fn pull_group(
>          }
>      }
>
> -    if skip_info.count > 0 {
> +    if skip_info.count() > 0 {
>          task_log!(worker, "{}", skip_info);
>      }
>
> --
> 2.30.2