From: "Fabian Grünbichler" <f.gruenbichler@proxmox.com>
To: Proxmox Backup Server development discussion
<pbs-devel@lists.proxmox.com>
Subject: Re: [pbs-devel] [PATCH v4 proxmox-backup 4/7] fix #4182: server: sync: allow pulling groups concurrently
Date: Mon, 07 Apr 2025 09:21:04 +0200
Message-ID: <1744009968.ip6vyc3mbq.astroid@yuna.none>
In-Reply-To: <D8Y1V8BY8F6I.25I1HV13SAN63@proxmox.com>
On April 4, 2025 8:02 pm, Max Carrara wrote:
> On Fri Apr 4, 2025 at 3:49 PM CEST, Christian Ebner wrote:
>> Currently, a sync job pulls the backup groups and the snapshots
>> contained within them sequentially, so for remote syncs the download
>> speed is limited by the single http2 connection of the source reader
>> instance. High-latency networks suffer most from this limitation.
>>
>> Improve the throughput by allowing up to a configured number of
>> backup groups to be pulled concurrently, creating tasks which connect
>> to and pull from the remote source in parallel.
>>
>> Make the error handling and accounting logic for each group pull
>> reusable by moving it into its own helper function that returns the
>> future.
>>
>> The store progress is placed behind an atomically reference-counted
>> mutex (`Arc<Mutex<StoreProgress>>`) to allow concurrent status updates.
>
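(For illustration only: a rough sketch of the bounded-concurrency pattern
the quoted commit message describes -- not the actual patch code. The
`pull_group` helper and the `parallel_groups` parameter are placeholders,
and the real implementation may spawn separate worker tasks rather than
use `buffer_unordered`; only the "at most N group pulls in flight" idea is
the same.)

    use anyhow::Error;
    use futures::stream::{self, StreamExt, TryStreamExt};

    /// Placeholder for the per-group pull future the commit message mentions.
    async fn pull_group(_group: String) -> Result<(), Error> {
        // connect to the remote source and pull the group's snapshots ...
        Ok(())
    }

    /// Pull all groups, with at most `parallel_groups` of them in flight.
    async fn pull_groups(groups: Vec<String>, parallel_groups: usize) -> Result<(), Error> {
        stream::iter(groups)
            .map(pull_group)
            .buffer_unordered(parallel_groups) // limits concurrent group pulls
            .try_collect::<()>()
            .await
    }
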
> Yeah, so... I've got some thoughts about this:
>
> First of all, I think that that's *fine* here, as the
> `Arc<Mutex<StoreProgress>>` probably isn't going to face that much lock
> contention or something anyway. So to get that out of the way, IMO we
> can keep that here as it is right now.
>
> But in the future I do think that we should check the locations where we
> have that kind of concurrent data access / modification, because I feel
> the number of mutexes is only going to keep growing. That's not a
> bad thing per se, but it does come with a few risks / drawbacks (e.g. a
> higher risk of deadlocks).
>
> Without going too deep into the whole "how to avoid deadlocks" discussion
> and other things, here's an alternative I want to propose that could
> perhaps be done in (or as part of) a different series, since it's a bit
> out of scope for this one here. (Though, if you do wanna do it in this
> one, I certainly won't complain!)
>
> First, since `StoreProgress` only contains four `usize`s, it should be
> fairly easy to convert the ones being modified into `AtomicUsize`s and
> perhaps add helper methods to increase their respective values;
> something like this:
>
> #[derive(Debug, Default)]
> /// Tracker for progress of operations iterating over `Datastore` contents.
> pub struct StoreProgress {
>     /// Completed groups
>     pub done_groups: AtomicUsize,
>     /// Total groups
>     pub total_groups: u64,
>     /// Completed snapshots within current group
>     pub done_snapshots: AtomicUsize,
>     /// Total snapshots in current group
>     pub group_snapshots: u64,
> }
>
> // [...]
>
> impl StoreProgress {
>     pub fn add_done_group(&self) {
>         let _ = self.done_groups.fetch_add(1, Ordering::Relaxed);
>     }
>
>     // [...]
> }
>
> (of course, what it all should look like in detail is up to bikeshedding :P)
>
> Something like that would probably be nicer here, because:
>
> - You won't need to wrap `StoreProgress` within an `Arc<Mutex<T>>`
>   anymore -- a shared reference is enough, since ...
> - Operations on atomics take `&self` (that's the whole point of them ofc ;p )
>
> This means that:
> - Cloning an `Arc<T>` is not necessary anymore
>   --> approx. two atomic ops saved per `Arc` used
>       (on create/clone and drop of each `Arc`)
> - Locking the `Mutex<T>` is also not necessary anymore
>   --> two atomic ops saved for each call to `.lock()`
>       (acquire and release)
>
> In turn, this is replaced by a single atomic call with
> `Ordering::Relaxed` (which is fine for counters [0]). So, something like
>
>     progress.lock().unwrap().done_groups += 1;
>
> would just become
>
>     progress.add_done_group();
>
> which is also quite neat.
>
> Note however that we might have to split that struct into a "local" and
> "shared" version or whatever in order to adapt it all to the current
> code (a locally-used struct ofc doesn't need atomics).
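(Also just a hedged sketch of what such a "local vs. shared" split could
look like -- `SharedStoreProgress` and its methods are made-up names, not
existing code. The totals stay plain integers on the assumption that they
are only written before the concurrent phase starts.)

    use std::sync::atomic::{AtomicU64, Ordering};

    /// Hypothetical shared counterpart to a plain, local progress struct:
    /// counters bumped concurrently become atomics, totals fixed up front
    /// stay plain fields.
    #[derive(Debug, Default)]
    pub struct SharedStoreProgress {
        done_groups: AtomicU64,
        pub total_groups: u64,
        done_snapshots: AtomicU64,
        pub group_snapshots: u64,
    }

    impl SharedStoreProgress {
        pub fn add_done_group(&self) {
            self.done_groups.fetch_add(1, Ordering::Relaxed);
        }

        pub fn add_done_snapshot(&self) {
            self.done_snapshots.fetch_add(1, Ordering::Relaxed);
        }

        /// Read the current counters as plain values, e.g. for logging.
        pub fn done(&self) -> (u64, u64) {
            (
                self.done_groups.load(Ordering::Relaxed),
                self.done_snapshots.load(Ordering::Relaxed),
            )
        }
    }
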
>
> Again, I think what you're doing here is perfectly fine; I just think
> that we should have a look at all of those concurrent data accesses and
> see whether we can slim some stuff down or perhaps have some kind of
> statically enforced mutex ordering for deadlock prevention [1].
>
> [0]: https://doc.rust-lang.org/nomicon/atomics.html#relaxed
> [1]: https://www.youtube.com/watch?v=Ba7fajt4l1M
we do have a similar construct in the client (although still with Arc<>,
maybe we could eliminate it there? ;)):
https://git.proxmox.com/?p=proxmox-backup.git;a=blob;f=pbs-client/src/backup_stats.rs;h=f0563a0011b1117e32027a8a88f9d6f65db591f0;hb=HEAD#l45
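
(Not the actual backup_stats.rs code, just a hedged sketch of that general
"Arc around a struct of atomics" pattern, assuming tokio as the runtime.
It also shows why the Arc is hard to get rid of: tokio::spawn requires
'static, so the atomics replace the Mutex, not the Arc -- unless the
futures are joined on the same task instead of being spawned.)

    use std::sync::{
        atomic::{AtomicU64, Ordering},
        Arc,
    };

    #[derive(Default)]
    struct Counters {
        chunks: AtomicU64,
        bytes: AtomicU64,
    }

    #[tokio::main]
    async fn main() {
        let counters = Arc::new(Counters::default());

        let mut handles = Vec::new();
        for _ in 0..4 {
            let counters = Arc::clone(&counters);
            // tokio::spawn wants a 'static future, so borrowing a stack-local
            // `&Counters` would not compile here -- hence the Arc, while the
            // atomics make a surrounding Mutex unnecessary.
            handles.push(tokio::spawn(async move {
                counters.chunks.fetch_add(1, Ordering::Relaxed);
                counters.bytes.fetch_add(4096, Ordering::Relaxed);
            }));
        }
        for handle in handles {
            handle.await.unwrap();
        }

        println!(
            "chunks: {}, bytes: {}",
            counters.chunks.load(Ordering::Relaxed),
            counters.bytes.load(Ordering::Relaxed),
        );
    }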