From: "Fabian Grünbichler" <f.gruenbichler@proxmox.com>
To: Christian Ebner <c.ebner@proxmox.com>,
Proxmox Backup Server development discussion
<pbs-devel@lists.proxmox.com>
Subject: Re: [pbs-devel] [PATCH v3 proxmox-backup 15/33] fix #3044: server: implement push support for sync operations
Date: Mon, 14 Oct 2024 11:41:06 +0200 (CEST) [thread overview]
Message-ID: <1609079643.3855.1728898866795@webmail.proxmox.com> (raw)
In-Reply-To: <4aef8355-d0a6-48fe-b413-7a70cfce11c9@proxmox.com>
> Christian Ebner <c.ebner@proxmox.com> hat am 14.10.2024 11:32 CEST geschrieben:
> On 10/10/24 16:48, Fabian Grünbichler wrote:
> >> +// Read fixed or dynamic index and push to target by uploading via the backup writer instance
> >> +//
> >> +// For fixed indexes, the size must be provided as given by the index reader.
> >> +#[allow(clippy::too_many_arguments)]
> >> +async fn push_index<'a>(
> >> + filename: &'a str,
> >> + index: impl IndexFile + Send + 'static,
> >> + chunk_reader: Arc<dyn AsyncReadChunk>,
> >> + backup_writer: &BackupWriter,
> >> + manifest: &mut BackupManifest,
> >> + crypt_mode: CryptMode,
> >> + size: Option<u64>,
> >> + known_chunks: &Arc<Mutex<HashSet<[u8; 32]>>>,
> >> +) -> Result<SyncStats, Error> {
> >> + let (upload_channel_tx, upload_channel_rx) = mpsc::channel(20);
> >> + let mut chunk_infos =
> >> + stream::iter(0..index.index_count()).map(move |pos| index.chunk_info(pos).unwrap());
> >
> > so this iterates over all the chunks in the index..
> >
> >> +
> >> + tokio::spawn(async move {
> >> + while let Some(chunk_info) = chunk_infos.next().await {
> >> + let chunk_info = chunk_reader
> >> + .read_raw_chunk(&chunk_info.digest)
> >
> > and this reads them
> >
> >> + .await
> >> + .map(|chunk| ChunkInfo {
> >> + chunk,
> >> + digest: chunk_info.digest,
> >> + chunk_len: chunk_info.size(),
> >> + offset: chunk_info.range.start,
> >> + });
> >> + let _ = upload_channel_tx.send(chunk_info).await;
> >
> > and sends them further along to the upload code.. which will then (in
> > many cases) throw away all that data we just read because it's already
> > on the target and we know that because of the previous manifest..
> >
> > wouldn't it be better to deduplicate here already, and instead of
> > reading known chunks over and over again, just tell the server to
> > re-register them? or am I missing something here? :)
>
> Good catch, this is indeed a possible huge performance bottleneck!
>
> Did fix this by moving the known chunks check here (as suggested) and
> stream a `MergedChunkInfo` instead of `ChunkInfo`, which allows to only
> send the chunk's digest and size over to the backup writer. By this
> known chunk are never read.
this would be one of the few cases btw where re-uploading a chunk that is missing on the server side (for whatever reason) would be possible - not sure how easy it would be to integrate though ;)
_______________________________________________
pbs-devel mailing list
pbs-devel@lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
next prev parent reply other threads:[~2024-10-14 9:41 UTC|newest]
Thread overview: 60+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-09-12 14:32 [pbs-devel] [PATCH v3 proxmox-backup 00/33] fix #3044: push datastore to remote target Christian Ebner
2024-09-12 14:32 ` [pbs-devel] [PATCH v3 proxmox-backup 01/33] api: datastore: add missing whitespace in description Christian Ebner
2024-09-12 14:32 ` [pbs-devel] [PATCH v3 proxmox-backup 02/33] server: sync: move sync related stats to common module Christian Ebner
2024-09-12 14:32 ` [pbs-devel] [PATCH v3 proxmox-backup 03/33] server: sync: move reader trait to common sync module Christian Ebner
2024-09-12 14:32 ` [pbs-devel] [PATCH v3 proxmox-backup 04/33] server: sync: move source " Christian Ebner
2024-09-12 14:32 ` [pbs-devel] [PATCH v3 proxmox-backup 05/33] client: backup writer: bundle upload stats counters Christian Ebner
2024-10-10 14:49 ` Fabian Grünbichler
2024-09-12 14:32 ` [pbs-devel] [PATCH v3 proxmox-backup 06/33] client: backup writer: factor out merged chunk stream upload Christian Ebner
2024-09-12 14:32 ` [pbs-devel] [PATCH v3 proxmox-backup 07/33] client: backup writer: add chunk count and duration stats Christian Ebner
2024-09-12 14:32 ` [pbs-devel] [PATCH v3 proxmox-backup 08/33] client: backup writer: allow push uploading index and chunks Christian Ebner
2024-09-12 14:32 ` [pbs-devel] [PATCH v3 proxmox-backup 09/33] server: sync: move skip info/reason to common sync module Christian Ebner
2024-09-12 14:32 ` [pbs-devel] [PATCH v3 proxmox-backup 10/33] server: sync: make skip reason message more genenric Christian Ebner
2024-09-12 14:33 ` [pbs-devel] [PATCH v3 proxmox-backup 11/33] server: sync: factor out namespace depth check into sync module Christian Ebner
2024-09-12 14:33 ` [pbs-devel] [PATCH v3 proxmox-backup 12/33] config: acl: mention optional namespace acl path component Christian Ebner
2024-09-12 14:33 ` [pbs-devel] [PATCH v3 proxmox-backup 13/33] config: acl: allow namespace components for remote datastores Christian Ebner
2024-10-10 14:49 ` Fabian Grünbichler
2024-10-14 8:18 ` Christian Ebner
2024-09-12 14:33 ` [pbs-devel] [PATCH v3 proxmox-backup 14/33] api types: define remote permissions and roles for push sync Christian Ebner
2024-09-12 14:33 ` [pbs-devel] [PATCH v3 proxmox-backup 15/33] fix #3044: server: implement push support for sync operations Christian Ebner
2024-10-10 14:48 ` Fabian Grünbichler
2024-10-14 9:32 ` Christian Ebner
2024-10-14 9:41 ` Fabian Grünbichler [this message]
2024-10-14 9:53 ` Christian Ebner
2024-10-14 10:01 ` Fabian Grünbichler
2024-10-14 10:15 ` Christian Ebner
2024-09-12 14:33 ` [pbs-devel] [PATCH v3 proxmox-backup 16/33] config: jobs: add `sync-push` config type for push sync jobs Christian Ebner
2024-10-10 14:48 ` Fabian Grünbichler
2024-10-14 8:16 ` Christian Ebner
2024-09-12 14:33 ` [pbs-devel] [PATCH v3 proxmox-backup 17/33] api: push: implement endpoint for sync in push direction Christian Ebner
2024-09-12 14:33 ` [pbs-devel] [PATCH v3 proxmox-backup 18/33] api: sync: move sync job invocation to server sync module Christian Ebner
2024-09-12 14:33 ` [pbs-devel] [PATCH v3 proxmox-backup 19/33] api: sync jobs: expose optional `sync-direction` parameter Christian Ebner
2024-10-10 14:48 ` Fabian Grünbichler
2024-10-14 8:10 ` Christian Ebner
2024-10-14 9:25 ` Fabian Grünbichler
2024-10-14 9:36 ` Christian Ebner
2024-09-12 14:33 ` [pbs-devel] [PATCH v3 proxmox-backup 20/33] api: sync: add permission checks for push sync jobs Christian Ebner
2024-09-12 14:33 ` [pbs-devel] [PATCH v3 proxmox-backup 21/33] bin: manager: add datastore push cli command Christian Ebner
2024-09-12 14:33 ` [pbs-devel] [PATCH v3 proxmox-backup 22/33] ui: group filter: allow to set namespace for local datastore Christian Ebner
2024-09-12 14:33 ` [pbs-devel] [PATCH v3 proxmox-backup 23/33] ui: sync edit: source group filters based on sync direction Christian Ebner
2024-09-12 14:33 ` [pbs-devel] [PATCH v3 proxmox-backup 24/33] ui: add view with separate grids for pull and push sync jobs Christian Ebner
2024-09-12 14:33 ` [pbs-devel] [PATCH v3 proxmox-backup 25/33] ui: sync job: adapt edit window to be used for pull and push Christian Ebner
2024-09-12 14:33 ` [pbs-devel] [PATCH v3 proxmox-backup 26/33] ui: sync: pass sync-direction to allow removing push jobs Christian Ebner
2024-09-12 14:33 ` [pbs-devel] [PATCH v3 proxmox-backup 27/33] ui: sync view: do not use data model proxy for store Christian Ebner
2024-09-12 14:33 ` [pbs-devel] [PATCH v3 proxmox-backup 28/33] ui: sync view: set sync direction when invoking run task via api Christian Ebner
2024-09-12 14:33 ` [pbs-devel] [PATCH v3 proxmox-backup 29/33] datastore: move `BackupGroupDeleteStats` to api types Christian Ebner
2024-09-12 14:33 ` [pbs-devel] [PATCH v3 proxmox-backup 30/33] api types: implement api type for `BackupGroupDeleteStats` Christian Ebner
2024-09-12 14:33 ` [pbs-devel] [PATCH v3 proxmox-backup 31/33] datastore: increment deleted group counter when removing group Christian Ebner
2024-09-12 14:33 ` [pbs-devel] [PATCH v3 proxmox-backup 32/33] api: datastore/namespace: return backup groups delete stats on remove Christian Ebner
2024-10-11 9:32 ` Fabian Grünbichler
2024-10-14 10:24 ` Christian Ebner
2024-09-12 14:33 ` [pbs-devel] [PATCH v3 proxmox-backup 33/33] server: sync job: use delete stats provided by the api Christian Ebner
2024-10-11 9:32 ` Fabian Grünbichler
2024-10-15 7:30 ` Christian Ebner
2024-10-15 7:44 ` Fabian Grünbichler
2024-10-15 8:04 ` Christian Ebner
2024-10-10 14:48 ` [pbs-devel] [PATCH v3 proxmox-backup 00/33] fix #3044: push datastore to remote target Fabian Grünbichler
2024-10-11 7:12 ` Christian Ebner
2024-10-11 7:51 ` Fabian Grünbichler
2024-10-14 11:04 ` [pbs-devel] partially-applied: " Fabian Grünbichler
2024-10-17 13:31 ` [pbs-devel] " Christian Ebner
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1609079643.3855.1728898866795@webmail.proxmox.com \
--to=f.gruenbichler@proxmox.com \
--cc=c.ebner@proxmox.com \
--cc=pbs-devel@lists.proxmox.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox